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1.0  Introduction 

The  goal  of  most  researchers  studying  “how  to  see"  is  object  recognition.  The  ap¬ 
proach  includes  separating  a  scene  into  objects,  describing  these  objects  in  terms  of 
some  model  such  as  generalized  cones  (Binford,  1971;  Marr,l982),  and  then  match¬ 
ing  the  result  to  a  canonical  description  in  memory  (see  Ballard  it  Brown,  1982). 
Surprisingly,  the  most  progress  has  been  made  in  the  last  two  steps,  whereas  the 
initial  first  step  of  identifying  objects  in  a  scene  is  still  largely  unsolved  in  computer 
vision  and  is  not  understood  at  all  for  biological  systems,  except  in  special  cases 
where  an  isolated  object  moves  against  a  stationary  background,  or  is  distinguished 
in  three-dimensions  with  no  occlusions  present. 

Our  work  differs  from  most  in  vogue  today  by  suggesting  that  a  scene  be 
separated  into  objects  not  by  edge  extraction  followed  by  contour  grouping,  but 
rather  by  recovering  regions  of  differing  material  qualities.  This  approach  was 
attempted  early  in  scene  analysis  (Ballard  ic  Brown, 1982),  but  then  was  based  only 
on  simple  image  features  such  as  color  and  texture.  Although  a  useful  beginning, 
a  more  reliable  method  is  to  use  the  image  features  to  infer  material  properties  of 
the  world  (e.g.  water,  cloud,  tree,  grass,  wood,  asphalt,  etc.)  and  to  break  up  the 
scene  into  its  “stuff” .  Such  world-based  properties  are  stable  and  survive  changes 
in  illumination,  viewpoint,  shading,  surface  orientation,  etc.,  which  confounded 
earlier  image-based  attempts  at  scene  analysis  (Heeger  &  Pentland  1987;  Kube  &c 
Pentland,  1986). 

By  identifying  the  “stuff”  a  region  is  made  of,  we  also  gain  the  additional 
benefit  of  knowing  something  about  what  object  that  region  might  correspond  to. 
For  example,  hair  implies  an  animal,  feathers  .  I  rd,  grass  a  surface  which  is  part 
of  the  terrain  (Richards,  1982;  Richards  ii  Boo  V  1988). 

Our  major  findings  are  : 

•  Useful  material  descriptions  must  include  both  shape  and  reflectance  predi¬ 
cates.  (In  many  cases,  dynamics  is  also  needed.)  Shape  topology  is  far  more 
important  than  precise  metrics. 

•  The  Cook-Torrance  reflectance  model  can  be  simplified  to  a  three-dimensional 
system  for  purposes  of  rendition  and  image  understanding. 

•  The  use  of  Graphics  Psychophysics  (i.e.  image  analysis  by  generating  synthetic 
images)  is  a  very  powerful  but  time-expensive  tool  for  image  understanding. 

1  The  frontispiece  is  a  composite  of  computer-generated  surfaces  and  pictures  taken  from 
Brodatz  (1966).  The  textures  for  all  these  surfaces  can  be  modeled  by  fractal  processes. 
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a  Angle  between  N  and  H 
6  Angle  between  L  and  H  or  V  and  H 
X  Wavelength 

D  Facet  slope  distribution  (unction 
d  Fraction  of  reflection  that  is  diffuse 
du,  Solid  angle  of  a  beam  of  incident  light 
E,  Energy  of  the  incident  light 
F  Reflectance  of  a  perfectly  smooth  surface 
/  Unblocked  fraction  of  the  hemisphere 
C  Geometrical  attenuation  factor 
H  Unit  angular  bisector  of  V  and  L 
/,  Average  intensity  of  the  incident  light 
Im  Intensity  of  the  incident  ambient  light 
/,  Intensity  of  the  reflected  light 
lm  Intensity  of  the  reflected  ambient  light 
k  Extinction  coefficient 
L  Unit  vector  in  the  direction  of  a  light 
m  Root  mean  square  slope  of  facets 
N  Unit  surface  normal 
n  Index  of  refraction 
R „  Ambient  reflectance 
R  Total  bidirectional  reflectance 
Rj  Diffuse  bidirectional  reflectance 
R,  Specular  bidirectional  reflectance 
s  Fraction  of  reflectance  that  is  specular 
V  Unit  vector  in  direction  of  the  viewer 
w  Relative  weight  of  a  facet  slope 

Figure  1  Parameters  of  the  Cook  <k  Torrance  (1982)  model  for  surface  re¬ 
flectance. 

•  We  present  the  first  study  in  computational  acoustics  showing  how  material 
properties  can  be  recovered  from  sounds. 

•  More  work  is  needed  to  understand  the  useful  image  correlates  of  the  motion 
of  non-rigid  objects,  especially  those  materials  without  surface  markers  such 
as  rubber  or  water. 

2.0  The  Problem 

Recovering  material  properties  from  image  information  is  difficult  because  the  avail¬ 
able  signal,  namely  the  image  intensity,  is  a  function  of  severed  factors  which  are 
confounded.  For  example,  the  observed  intensity  (flux)  /  of  a  point  on  the  retina 
depends  upon  (l)  the  spectral  reflectance  (albedo)  of  the  surface,  (p(A)),  (2)  the 
strength  and  spectral  composition  of  the  illuminant  E( A),  (3)  the  geometry  and 
structure  of  the  three-dimensional  surface  N(ff,  r)  relative  to  the  light  source  direc¬ 
tion  L  and  the  viewing  position  V,  and  finally  (4)  the  reflecting  properties  of  the 
material  R.  In  particular, 
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I(x,y,\)  =  p(\)*E(\)*(NL)*R  (l) 

Thus,  if  one  wishes  to  “know”  what  kind  of  pigment  p( A)  makes  up  the  sur¬ 
face  (chlorophyll,  xanthophyll,  fiavanoid,  ferrite,  etc.)  the  observer  must  somehow 
discount  the  illuminant  E  (including  shadows),  the  surface  orientation  N,  the  illu- 
minant  direction  L,  his  particular  viewing  position  V,  and  any  specular  component 
of  the  surface.  (The  latter  component  is  embedded  in  R,  whose  complete  descrip¬ 
tion  requires  many  of  the  additional  parameters  defined  in  the  table  of  Figure  1.) 
Obviously  with  so  many  unknowns,  any  single  intensity  value  is  insufficient  (Horn, 
1977,  1987).  Rather,  several  different  regions  in  the  image  of  an  object  must  be 
examined  for  evidence  that  will  disambiguate  among  all  the  possible  origins  of  the 
image  intensity  values  (Richards,  1988).  For  example,  to  tell  whether  an  urn  is 
made  of  brass,  or  rubber,  one  must  “inspect  it”  and  integrate  information  from  the 
highlight,  the  occluding  boundary  at  two  or  more  locations,  the  intensities  arising 
from  shaded  regions,  etc. 

Considering  the  complexity  of  the  interrelated  unknowns,  the  problem  of  recov¬ 
ering  the  material  property  from  image  intensities  seems  impossible.  Yet  observers 
can  indeed  distinguish  between  materials  such  as  metal,  plastic,  rubber,  ceramic, 
etc.,  even  when  all  other  factors  such  as  object  shape  and  illumination  are  constant 
(see  illustrations  of  Cook  It  Torrance,  1982;  Richards  It  Ullman,  1987).  On  what 
basis  are  these  inferences  made  ? 

3.0  The  Approach:  Graphics  Psychophysics 

Obviously  an  observer  can  not  recover  all  the  physical  parameters  of  a  surface.  Only 
some  of  these  are  relevant  to  the  inference  process.  Our  method  for  identifying  these 
relevant  parameters  is  simple:  we  use  computer  graphics  to  generate  an  image  that 
“looks  like”  the  real  surface. 

Figure  2  illustrates  in  more  detail  the  role  “analysis-by-synthesis”  can  play 
in  Image  Understanding.  When  a  scene  is  imaged  upon  our  retina,  it  becomes  a 
pattern  of  receptor  activities,  as  shown  in  the  lower  right  of  the  figure.  This  is  what 
we  are  given,  namely  a  2D  array  of  numbers.  Our  task  is  to  infer  the  structure  and 
stuff  of  the  world,  as  shown  by  the  Inference  arrow  on  the  upper  right.  We  face 
two  problems.  First,  we  need  to  know  the  numbers  in  the  array.  Second,  we  need 
to  know  how  the  numbers  “got  there”. 

For  any  given  scene,  finding  the  numbers  in  the  image  array  is  trivial.  Simply 
take  a  picture  with  a  vidicon  and  read  the  numbers  out  of  frame  buffer.  We  can 
check  to  be  sure  these  numbers  are  correct  by  displaying  the  contents  of  the  buffer 
to  see  if  the  2D  display  looks  like  the  reed  3D  scene.  In  terms  of  Figure  2,  the 
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Figure  2  The  role  of  computer  graphics. 


pattern  of  numbers  in  the  display’s  buffer  then  matches  that  of  the  image  of  the 
scene  (or  at  least  there  is  a  one-to-one  mapping). 

The  tough  problem  is  the  second  one,  determining  why  the  3D  surfaces  in  the 
world  led  to  the  particular  array  of  numbers.  Without  knowledge  of  this  mapping, 
the  inference  problem  can’t  be  solved,  for  we  have  not  specified  the  function  we 
wish  to  invert.  In  terms  of  our  figure,  we  need  to  know  how  to  load  the  frame 
buffer  to  duplicate  the  real  3D  scene.  This  is  a  problem  in  computer  graphics. 
The  generation  of  a  realistic-looking  cloud  or  mountain  forces  the  programmer 
to  model  the  world,  to  make  explicit  the  optical  properties  of  a  surface  that  are 
psychophysically  relevant  (Blinn,  1977,  1982).  The  more  realistic  the  display,  the 
closer  the  programmer’s  model  captures  the  relevant  properties  and  geometery  of 
the  surface.  Specifically,  we  know  the  origin  or  cause  of  each  image  intensity  element 
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J(x,  y,  A,  t)  on  the  retina  of  our  eye,  for  we  have  specified  function  explicitly  in  our 
program.  Hence  the  mapping  from  the  world  to  the  image  plane  is  known.  It  then 
becomes  apparent  what  external  physical  properties  underly  the  observed  property 
of  “metal”  or  “rubber”,  as  in  the  Cook  k  Torrance  reflectance  model  (1982),  or 
the  rough  surface  of  a  mountain,  as  in  the  fractal  models  of  Carpenter  (1980),  Voss 
(1981)  and  especially  Pentland  (1984).  Computer  graphics  thus  provides  us  with 
the  opportunity  to  complete  our  understanding  of  the  mapping  of  the  world  into 
the  image.  Once  understood,  possible  inverse  mappings  and  constraints  can  be 
intelligently  explored.  Inferences  from  images  can  proceed. 
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4.0  Inferring  Material  Types 

In  the  sections  to  follow,  we  illustrate  our  Graphics  Psychophysics  approach  with 
two  vision  examples,  inferring  “wood”  and  “water”.  Appendix  II  also  includes 
an  analogous  study  of  how  a  natural  property  might  be  recovered  using  acoustic 
information. 


4.1  Why  Wood  Looks  Like  Wood  (D.  Honig) 

1.0  Introduction 

The  texture  and  pattern  of  a  surface  provides  a  major  source  of  visual  information 
about  the  material  or  “stuff”  we  are  looking  at.  Once  the  type  of  material  is 
classified,  a  host  of  other  facts  can  be  accessed,  such  as  the  expected  rigidity  of  the 
surface,  its  hardness,  its  density — or  other  properties  that  may  guide  us  in  object 
recognition  or  manipulation.  For  the  human  observer,  the  classification  of  materials 
using  optical  information  is  almost  automatic  and  consequently  seems  trivial.  Do 
we  simply  store  a  vast  collection  of  textures  typical  of  surfaces  of  interest,  learning 
by  example  the  proper  associations?  Or,  alternatively,  do  we  first  infer  from  the 
optical  array  the  type  of  process  which  created  the  pattern,  and  then  proceed  with 
our  inference?  Here,  we  take  the  latter  approach:  how  can  the  process  which 
created  a  visible  surface  structure  be  inferred?  We  illustrate  this  approach  using 
the  pattern  created  by  wood  grain. 

2.0  The  Method 

To  understand  why  wood  looks  like  wood,  we  begin  by  creating  a  rendition  of  wood, 
using  computer  graphics.  This  rendition  is  not  just  a  texture  mapping  or  “painting” 
of  a  pattern  onto  a  surface,  but  is  actually  a  three-dimensional  construction  based 
upon  tree  structure  and  how  trees  are  cut  up  into  lumber.  Trees  are  a  natural 
category  of  plant  with  certain  family  traits.  Common  structural  properties  of  trees 
combine  with  the  ways  in  which  lumber  is  commonly  cut  to  yield  a  pattern  of 
textural  elements  that  is  practically  unique  to  wood.  It  is  these  kinds  of  regularities 
that  allow  us  to  classify  things.  By  using  computer  graphics  to  create  a  3D  model 
of  the  tree  structure  and  then  by  cutting  this  model  structure,  we  can  identify  the 
origin  of  each  regularity  typical  of  the  wood  grain  pattern. 
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Figure  1  The  structure  of  a  tree  (from  Sligo,  1985). 

S.O  The  Processes 

As  previously  mentioned,  the  pattern  typical  of  wood  grain  depends  both  upon  the 
natural  structure  of  trees  and  upon  the  way  trees  are  cut  into  lumber.  We  discuss 
each  process  separately,  prior  to  presenting  our  model  of  these  processes. 

S.l  The  Tree 

A  tree  is  a  living  process,  absorbing  minerals  from  the  ground  and  photosynthetic 
energy  from  the  sun.  The  trunk  and  its  branches  are  cylindrical  structures  with 
three  main  parts:  the  cambium,  the  pores,  and  radial  compartments  (see  Figure  1). 

S.l. 2  The  Cambium 

The  cambium  is  the  only  live  part  of  the  tree  aside  from  the  leaves.  It  is  a  thin, 
vascular  layer  in  the  trunk  and  in  all  branches  and  roots.  Bark  protects  the  cambium 
on  the  outside.  (Here  we  neglect  the  bark,  which  is  stripped  prior  to  cutting 
boards.)  The  cylindrical  cambium  shell  grows  outward  by  depositing  concentric 
layers  of  xylem  (wood)  cells  on  its  inner  surface.  Fine  tubes  that  transport  water 
and  other  minerals  up  the  tree  are  interspersed  throughout  the  woody  tissue.  On 
the  outer  surface  of  the  cambium  layer  the  phloem  (inner  bark)  is  deposited,  which 
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Figure  2  Typical  ways  in  which  trees  are  cut  into  lumber. 

transports  photosynthetic  products  down  the  tree.  The  outer  surface  of  the  inner 
bark  is  the  bark  which  protects  the  tree. 

The  annual  “growth  rings”  of  a  tree  are  formed  because  as  the  cambium  grows 
outward,  depositing  new  wood  on  its  inner  surface,  the  new  tissue  gets  darker 
toward  the  end  of  the  growing  season.  As  winter  approaches,  the  tree  accretes  new 
wood  more  slowly,  so  the  natural  pigmentation  is  denser. 

8.1.8  The  Pores 

The  tree  transports  various  fluids  through  pores.  These  pores  are  simply  cellulose- 
lined  cells  or  fine,  tubular  structures  oriented  parallel  to  the  axis  of  the  trunk  or 
branch. 

8-H  Radial  Compartments  (Rays) 

Trees  defend  themselves  against  infection  by  partitioning,  walling-off  damaged  and 
infected  tissue.  Two  of  the  interior  walls  of  the  partitions  are  radial  planes  of 
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Figure  S  Several  special  cuts  of  a  tree  into  boards.  The  shrinkage  charac¬ 
teristics  of  lumber  depend  on  how  it  is  cut  (from  the  U.S.  Forest  Products 
Laboratory). 


parenchymal  tissue  distributed  among  the  woody  tissue.  Vertically,  trees  stop  the 
spread  of  disease  by  blocking  the  tubes  (pores)  passing  through  the  site  of  injury. 
Along  the  radius,  trees  wall-off  infection  with  the  annual  growth  rings,  whose  high 
density  serves  as  an  effective  barrier.  The  compartments  are  most  visible  in  cross 
section,  for  wood  will  often  crack  along  the  radial  walls  of  compartments. 


S.2  Lumbering 

When  trees  are  cut  into  boards,  there  are  typically  only  three  major  types  of  cuts: 
the  “stump”  cut;  the  “board”  cut;  and  the  “plywood”  cut  (see  Figure  2). 

Let  the  axis  of  the  trunk  or  branch  be  oriented  in  the  y-direction.  A  “stump” 
cut  is  simply  a  plane  whose  normal  vector  is  parallel  to  Y .  The  typical  “board”  cut 
is  a  plane  parallel  to  Y,  i.e.  parallel  to  the  axis  of  the  branch.  These  cuts  can  have 
rather  complicated  cross-sections,  as  illustrated  in  Figure  3.  The  “plyboard”  cut  is 
a  cylindrical  shell  or  “stripping”  parallel  to  the  cylindrical  structure  of  the  trunk  or 
branch.  Finally,  there  are  several  “odd  cuts”,  like  spheres  or  dowels.  Most  of  our 
renditions  used  board  cuts,  which  generate  the  most  common  wood  grain  patterns. 
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Cfrfsy  poisirs 

Figure  4  Simplified  model  of  the  three  major  components  of  a  tree’s  structure. 


4.0  The  Model 

Our  idealized  tree  has  three  structures:  a  series  of  cylindrical  shells  (cambium),  fine 
tubular  vessels  running  parallel  to  the  cylindrical  axis  (pores)  and  radial  compart¬ 
ments  (rays).  As  illustrated  in  Figure  4,  these  we  identify  as  the  “rings”,  “rays”, 
and  “pores”. 

The  “rings”  in  the  computer  model  were  created  by  concentric  cylinders  with  a 
fixed  shell  width,  but  having  a  gray  scale  gradient  across  the  shell.  The  pores  were 
simulated  by  randomly  distributing  tiny  vertical  tubes  throughout  the  modelled 
tree  trunk  (i.e.  the  nested  cylinders).  The  rays  were  modelled  as  radial  vertical 
rectangular  planes  of  bounded  radial  and  vertical  extent.  These  planes  were  ren¬ 
dered  darker  than  the  backgound  material,  spaced  at  roughly  20°  intervals. 

Although  wood  is  usually  light  brown,  some  varieties  like  “rosewood” ,  “cherry” , 
or  basswood  can  exhibit  a  range  of  pastel  colors.  Furthermore,  if  wood  is  stained 
or  o  oted,  it  can  be  of  almost  any  color.  Here,  we  generally  used  256  brightness 
le-'t  •  of  the  same  color.  One  part  red  to  one-third  green  to  one  fifth  blue  in  the 
C  >  monitors  produced  a  realistic  wood-brown. 
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5.0  The  Renditions  (Instant  Psychophysics) 

Each  rendition  can  be  characterized  by  the  plane  of  section  through  the  tree  trunk 
(or  branch).  With  the  Y-axis  of  the  coordinate  system  aligned  with  the  central  axis 
of  the  tree,  and  with  the  Z-axis  pointing  to  the  viewer,  any  plane  can  be  described 
by  its  normal  vector  \x,  y,  z}. 

Figure  5A  shows  a  “stump  cut”,  N  =  [  ?  1  .1],  overlaid  with  the  tapered  end  of 
a  dowel,  N  =  [-110].  For  the  stump  cut,  we  see  the  rings  manifest  as  circles,  with 
the  rays  appearing  as  radial  lines  and  the  pores  as  points.  We  have  also  added  a 
slow  radial  modification  in  albedo  to  model  longer-term  variations  in  density  due 
to  climate.  This  addition  breaks  up  the  otherwise  regular  geometric  pattern  and 
produces  a  more  acceptable  rendering.  As  an  aside,  we  note  here  that  the  particular 
waveform  of  the  annual  density  of  wood  is  not  very  critical  (i.e.  the  cross-sectional 
intensity  profile  of  the  rings).  Technically  the  most  dense  (and  hence  the  darkest) 
portion  of  the  ring  should  be  on  the  inner  surface  of  each  layer  of  the  cambium.  In 
practice,  the  typical  observer  ignores  this  subtlety — we  have  reversed  the  phase  of 
the  pattern  with  no  loss  in  rendition  quality. 
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The  remaining  figures  show  “board  cuts”  N  =  [a  0  6],  with  a,  b  taking  on 
slightly  different  values  for  each  of  the  five  panels  of  each  figure.  The  most  ob¬ 
vious  feature  of  all  these  panels  is  the  expected  elliptical  pattern  of  the  rings.  In 
Figure  IB,  only  the  rings  (plus  slow  climatic  modulation)  is  illustrated.  The  ren¬ 
dition  is  acceptable.  Adding  the  rays  greatly  improves  the  rendition  (Figure  1C). 
However,  breaking  up  the  regular  elliptical  pattern  by  modulating  the  cut  produces 
the  most  realistic  wood  grain.  Note  that  an  undulating  plane  of  section  is  equiv¬ 
alent  to  a  planar  section  of  a  tree  which  is  not  strictly  a  nested  stack  of  regular 
cylinders.  The  observation  that  the  presence  of  irregularities  is  a  necessary  part  of 
a  good  rendition  is  important.  It  means  that  the  inference  “wood”  requires  some 
evidence  that  a  strictly  regular,  machine-like  process  is  not  solely  responsible  for 
the  observed  structure. 

In  sum,  our  renditions  suggest  that  the  geometrical  model  consisting  of  nested 
cylinders,  radial  compartments,  and  pores  (which  create  a  fine-grained  “peppery” 
texture)  is  acceptable  provided  that  two  kinds  of  perturbations  are  included  in  the 
strictly  geometric  model:  1)  longer-term  climatic  variations  in  pigmentation  density 
and  2)  allowance  for  undulations  in  the  cylindrical  axes  and  diameter. 


6.0  The  Inference  Process 

A  wooden  surface  typically  exhibits  a  “grain  direction” ,  either  from  the  rays  or  from 
the  pores  or  both.  For  perfect  planar  cuts,  this  grain  direction  will  range  from  radial 
(stump  cut)  to  parallel  (board  cut).  The  major  axes  of  the  ellipses  of  the  growth 
rings  must  align  with  this  grain  direction,  all  intersecting  together  at  the  center  of 
the  tree.  In  addition,  the  spacing  between  the  elliptical  rings  must  be  the  same 
along  any  given  ray  (or  grain  direction).  Furthermore,  the  rings  must  exhibit  an 
albedo  variation  whose  function  is  normalized  for  the  different  separations  between 
the  rings  for  different  rays.  This  is  the  strictly  geometric  model.  In  addition,  some 
evidence  must  be  presented  that  the  pattern  is  not  simply  a  regular  geometric 
structure.  Slow  albedo  modulation  and  perturbations  in  the  shape  of  the  ellipses 
along  a  given  ray  seem  to  provide  convincing  evidence  of  a  natural  growth  process.1 


7.0  False  Targets 

One  simple  approach  to  enumerating  the  important  false  targets  is  to  recognize  that 
the  strictly  geometric  model  results  in  an  image  of  concentric  ellipses  with  equal 
radial  increments  from  one  to  the  next  ellipse.  We  now  have  four  free  parameters 
as  this  plane  is  moved  to  create  a  nesting  of  surfaces  in  3-space:  the  ratio  of  major 

1Note:  saw  cuts  may  also  create  density  variations  parallel  to  the  edge  of  the  boards. 
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to  minor  axis,  the  length  of  the  major  axes,  and  two  degrees  of  freedom  in  the 
direction  of  movement  of  the  common  axis  (i.e.  center  of  the  ellipsis). 

An  onion  falls  into  this  class  of  possible  generating  surfaces.  In  this  case,  the 
major  and  minor  axes  of  the  ellipse  are  equal  (circles),  and  the  plane  of  nested 
circles  is  moved  in  a  direction  perpendicular  to  the  original  plane  of  the  circles,  and 
the  major  axis  (radius)  is  decremented  as  we  move  away  from  the  initializing  plane, 
yielding  a  set  of  nested  circles  (or  ellipses  if  we  wish).  A  nesting  of  cones  also  could 
yield  a  false  target. 

To  exclude  these  possibilities,  we  may  require  further  evidence,  such  as  radial 
“cracks”  or  albedo  variations  in  the  ring  pattern.  But  the  key  reason  for  the  success 
of  the  inference  “wood”  from  the  image  slice  is  that  in  the  natural  world,  structures 
created  from  nested  cones  or  ellipses  are  very,  very  rare — indeed  it  is  difficult  to 
give  a  natural  example  of  such  a  structure!  A  nesting  of  spheres  is  possible,  such  as 
the  onion,  but  here  the  potential  for  confusion  is  only  with  the  less  common  “stump 
cut” .  Fortunately,  Nature  creates  structures  according  to  rules  which  tend  to  group 
similar  classes  together,  creating  “modes”  of  particular  structures  rather  than  a 
continuum  of  all  possible  functions  (Bobick,  1987).  The  cylindrical  construction  of 
plants  and  trees  is  an  example  of  such  a  “mode”  which  permits  successful  inferences 
from  the  pattern  we  recognize  as  “wood”. 


8.0  Summary 

One  might  ask,  was  computer  graphics  necessary  to  determine  the  characteristic 
features  of  wood?  Couldn’t  we  have  just  sketched  the  projections  of  the  main 
components  of  the  tree  projected  onto  the  image  plane,  just  as  an  artist  might  do? 

Certainly  an  artist’s  rendition  of  any  scene  is  informative  and  helpful.  But 
can  the  artist  tell  you  exactly  what  each  brush  stroke  represents  in  terms  of  a 
world  property?  Without  such  knowledge,  the  inference  from  image  to  world  is  not 
possible.  Furthermore,  the  artist  is  depicting  his  internal  model  of  the  material, 
which  is  not  necessarily  an  accurate  world  model — certainly  not  a  physical  model. 
Often,  certain  features  are  exaggerated  for  emphasis,  and  usually  only  common 
views  are  depicted.  The  graphics  model,  on  the  other  hand,  is  not  limited  to  special 
views  and  exaggeration  must  be  made  explicit.  Hence  the  graphics  model  can  be 
used  to  explore  unusual  viewing  or  illumination  conditions  to  determine  the  actual 
character  of  the  observer’s  internal  model.  Such  power  is  needed  if  the  observer’s 
representation  and  inference  process  is  to  be  understood  fully.  Further  examples 
of  this  point  follow  in  subsequent  sections,  where  unusual  viewing  conditions  of 
a  “correct”  physical  model  lead  to  mis-perceptions  because  the  observer’s  model 
assumes  particular  viewing  and  illumination  conventions. 
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4.2  Inferring  “Water”  from  Images  (T.J.  Kung) 


1.0  Introduction 

Water  waves  are  recognizable  from  a  black  and  white  photo.  What  is  the  image 
information  that  allows  us  to  make  this  inference?  In  order  to  answer  this  question, 
we  use  computer  graphics  to  simulate  water.  Once  a  perceptually  deceiving  sim¬ 
ulation  of  water  is  obtained,  then  the  study  of  the  inference  process  can  proceed. 
This  report  presents  a  model  for  generating  near-natural  images  of  water  waves. 

Computer  generated  water  waves  have  been  created  previously  for  flight  sim¬ 
ulators.  However,  these  images  did  not  look  very  realistic,  probably  because  very 
simple  models  were  used.  For  instance,  Max  (1981)  presented  a  ray-tracing  pro¬ 
cedure  in  which  ocean  waves  and  islands  are  rendered.  Although  he  was  able  to 
render  good  reflection  images  off  waves,  the  ocean  waves  do  not  look  like  real  waves. 
Ogden  (1985)  generated  a  real-time  animation  sequence  of  water  by  using  the  Burt 
Pyramid  (1983),  but  again  her  model  was  not  based  on  the  physics  of  water.  Fur¬ 
thermore,  because  the  method  essentially  generated  fractals  in  the  image  plane, 
there  was  no  real  description  of  the  3D  surface.  Here,  our  objective  is  to  create  a 
3D  surface  of  water  undergoing  wave  motion,  and  to  be  able  to  view  this  surface 
from  any  arbitrary  angle.  Complicated  effects  like  the  reflection  of  an  object  off  the 
water  surface  are  not  considered  in  this  paper. 

We  divide  the  problem  into  three  parts:  (1)  reflection,  (2)  surface  shape,  and 
(3)  dynamics.  In  interpreting  images,  basically  we  are  trying  to  infer  the  surface 
properties  and  shape  from  the  image  intensities.  As  a  first  step  in  understanding 
this  inference  process,  we  must  understand  what  properties  make  water  charac¬ 
teristically  different  from  other  surfaces.  If  we  knew  these  properties  and  how  to 
describe  them  mathematically  we  should  be  able  to  create  a  realistic  image.  Among 
the  surface  properties,  we  are  only  interested  in  the  optically  related  properties,  i.e. 
the  reflectance  function.  A  dynamic  view  yields  a  considerably  stronger  impression, 
therefore  a  brief  discussion  of  wave  dynamics  is  included  also. 


2.0  Reflectance  Function 

Because  water  is  a  transparent  material,  the  intensity  and  color  of  the  visible  light 
emerging  from  the  sea  depends  on 

(i)  The  illumination.  The  elevation  of  the  sun  and  the  sky  and  cloud  cover  deter¬ 
mine  the  quantity  and  spectrum  of  downwelling  incident  light. 
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Figure  1  Synthetic  water  images  are  shown  in  the  two  left-hand  panels.  The  up¬ 
per  right  panel  is  an  artist’s  rendition.  A  photographic  image  of  water  is  shown 
in  the  lower  right.  Most  people  prefer  one  of  the  synthetic  images,  demonstrating 
that  our  model  is  capturing  the  psychophysicaUy  relevant  information. 


(ii)  The  optical  properties  of  the  sea-water  itself.  The  light  which  penetrates  into 
the  sea  is  spectrally  modified  by  absorption  inside  the  medium  before  being 
partly  backscattered  toward  the  atmosphere.  These  absorption  and  backscat- 
tering  processes  have  their  origin  in  the  water  molecules,  and  also  in  all  the 
other  substances  present  in  the  sea  as  dissolved  or  particulate  matter. 

People  can  recognize  water  waves  from  a  black  and  white  photo,  therefore  the 
most  important  thing  in  perceiving  water  waves  is  the  relative  intensity  of  emerging 
light  from  different  places  of  the  water  surf  sice,  not  the  details  of  spectrum  and 
quantity.  This  fact  makes  it  possible  to  simplify  the  reflectance  function  without 
considering  the  functional  dependency  on  the  wavelength  of  lights. 

Cook  ic  Torrance  (1982)  proposed  a  reflectance  model  for  computer  graphics 
as  follows: 

f  =  ^amb^amb  d"  ^  ■®<iir(N  •  Ti)(dRmat  +  sRf pec)  (l) 

where  E  is  the  intensity  of  light  source,  N  is  surface  normal  unit  vector,  and  L 
is  the  light  direction  of  a  point  light  source.  The  point  light  sources  are  linearly 
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Figure  2  The  geometry  of  reflection. 

combined  with  the  ambient  light  source.  Each  point  light  source  is  further  divided 
into  a  specular  component  and  matte  component.  The  d  and  a  in  equation  (1)  are 
the  fractions  of  reflectance  that  are  matte  and  specular.  The  specular  component 
is  given  by: 

p  _  £  DG  m 

Rtpee  ~  r  (N  •  L)(L  •  V)  W 

where  Fisa  Fresnel  term  which  describes  how  light  is  reflected  from  each  smooth 
microfacet.  It  is  a  function  of  incidence  angle  and  wave  length.  The  geometrical 
attenuation  factor  G  accounts  for  the  shadowing  and  masking  of  one  facet  by  sm¬ 
other.  The  facet  slope  distribution  function  D  represents  the  fraction  of  the  facets 
that  are  in  the  direction  H  (Figure  2). 

In  rendering  water,  we  use  Cook  Ac  Torrance’s  model  with  some  modifications. 
As  discussed  before,  water  backscatters  part  of  the  incident  light.  Although  the 
backscattered  light  is  a  function  of  the  incident  light  spectrum,  optical  properties 
of  the  water  itself,  and  the  elevation  of  viewing  position,  we  assume  these  condi¬ 
tions  are  fixed  in  a  limited  area  and  within  a  short  period  of  time.  Therefore  the 
backscattered  light  is  a  constant  and  is  uniform  over  the  water  surface. 

Figure  3  shows  the  value  of  Fresnel  term  for  various  indexes  of  refraction  n. 
The  Fresnel  term  is  significant  only  when  the  incident  angle  is  greater  than  70 
degrees.  For  water  the  index  of  refraction  equals  1.32.  Based  on  this  property,  we 
make  two  simplifications.  First,  we  neglect  the  matte  term  because  a  water  surface 
only  reflects  light  in  very  limited  directions.  Second,  we  neglect  the  point  light 
source-sun.  By  doing  so  we  eliminate  the  special  cases  when  it  is  dawn  and  when 
it  is  dusk,  and  also  scintillation  effects  which  are  easy  to  mimic. 

The  ambient  light  source  (sky)  is  considered  as  a  hemi-sphere  light  source. 
No  matter  where  the  viewing  position  is,  it  is  always  possible  to  see  some  lights 
from  somewhere  in  the  dome  directly  reflected  off  the  water  to  the  eyes.  When 
the  viewing  angle  is  less  than  70  degrees,  so  is  the  incident  angle,  and  the  reflected 
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Figure  S  Fresnel  reflectance  as  function  of  incident  angle.  The  parameter  on 
the  curves  is  the  index  of  refraction,  n.  Note  the  change  in  scale  for  the  right 
portion  of  the  graph. 


light  is  much  less  than  when  the  viewing  angle  is  greater  than  70  degrees.  In  other 
words,  the  intensity  of  reflected  light  from  the  water  surface  is  a  function  solely  of 
the  angle  between  the  viewing  direction  and  the  surface  normal  of  the  facet. 

As  a  conclusion  of  above  discussion,  our  reflectance  model  for  water  waves  is 

f  FDG 

Itotal  I bt  "h  1 ami  J  ^  -y ^  (^) 

where  It ,  is  the  backscattered  light  intensity  and  is  a  constant.  The  integration 
term  is  for  the  ambient  light  source.  Since  this  term  is  a  function  of  the  angle 
between  N  and  V  only,  we  calculate  these  values  numerically  for  different  angles 
and  store  them  in  a  table.  In  rendering  the  images,  we  then  just  find  the  value  of 
this  integration  term  from  that  table.  In  order  to  simplify  the  integral  work,  the 
Gaussian  model  is  used  for  the  facet  slope  distribution  function  D 

D  =  exp  (-(a/m)2)  (4) 

where  a  is  the  angle  between  N  and  H  (Figure  2).  0.8  is  used  for  m.  With  this 
value  D  is  small  when  a  is  greater  than  20  degrees  and  then  the  integration  can  be 
neglected.  In  other  words,  the  integrated  region  can  be  reduced  from  a  hemi-sphere 
to  a  cone  of  20  degr  e  radius.  The  expression  for  G  is 
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G  =  min 


1, 


2(N  •  H)(N  •  V)  2(N  •  H)(N  •  L) 


(V-H) 


(VH) 


(5) 


The  cone  is  further  divided  into  three  areas  based  on  which  term  in  Eq.  (5)  is 
important.  The  full  expression  of  F  used  in  the  integration  term  is 

'2  f.  +  jft+Acifl  (6, 

[c(j  +  c)  +  l] 


jr.Ikzfi 1 

2  ( g  +  c )2 


where 


c=V  H 
g2  =n2  +  c2  -  1. 


For  water,  n  is  1.32.  The  ratio  of  4»/ Iamb  was  set  to  1/3  in  our  simulation. 


S.O  Surface  Shape 


Traditionally,  sinusoidal  functions  are  used  to  model  ocean  waves  (Bigelow  ic  Ed¬ 
mondson,  1952;  Gross,  1972).  There  is  a  strong  theoretical  basis  to  assume  that 
the  ocean  surface  is  formed  by  sums  of  long  crested  sinusoids: 


w(x,y)  =  ^  C  +  B*  cos  ^jL*  + +  ^ 


(7) 


Since  this  model  requires  the  sum  of  a  large  number  of  wave  forms  to  obtain  a 
realistic  picture,  it  is  not  practical  for  image  generation.  Schachter  (1980)  proposed 
a  narrow-band  noise  model  to  overcome  this  problem.  In  this  model,  it  is  assumed 
that  envelopes  and  phase  shifts  are  slowly  varying: 


w(z)  =  a(x)  *  cos  (fix  +  0(z)) 


(8) 


Although  this  model  has  the  advantage  of  being  able  to  simulate  different  textures, 
the  results  show  that  the  water  generated  by  this  model  does  not  have  the  shape 
randomness  that  real  water  does. 

We  propose  another  model  which  is  a  summation  of  squared  of  sine  waves  with 
slowly  varying  wave  length: 


w(z,y)  =  Yi  A*  +  8in  (  A(gTy)  (ZC08^  +  y  *>nff)  +  ^)) 


0) 


Here  we  assume  waves  traveling  in  the  x,  y  plane  and  0  is  the  angle  between  x-axis 
and  the  traveling  wave  direction.  The  ideal  wave  form  is  sinusoidal  but  a  lot  of  ’ 
factors  can  distort  waves  from  their  ideal  forms,  for  instance  winds,  nonlinearity, 
and  nonuniform  water  depth  for  shallow  water  waves  (Bascom,  1980).  Normal  wave 


20 


Richards  k  Ullman 


Inferences  from  Images  4.2 


Component  Number 

1 

2 

3 

4 

Mean  wave-length  £ 

25 

30 

37.5 

10.7 

16.7 

Amplitude  A 

0.3 

0.4 

0.4 

0.15 

0.25 

Modulation  Rate  r 

0.009 

0.013 

0.016 

0.0 

Angle  (dgr)  0 

110 

90 

65 

120 

60 

Table  1  Static  wave  parameters. 


crests  tend  to  go  steep  before  collapse.  Consequently,  the  distorted  wave  form  is 
steep  at  the  crest  and  flat  at  trough.  We  found  it  takes  less  computation  to  get  the 
distorted  wave  form  by  using  a  function  which  has  its  shape  close  to  the  distorted 
wave  form  such  as  the  sine  squares  than  by  using  Fourier  series  (summation  of 
sinusoidal  waves).  [See  Appendix  II.] 

Because  of  nonlinear  effects,  wave  trains  are  always  modulated  when  traveling 
in  a  field.  Instead  of  assuming  slowly  varying  amplitude  and  phase  shift  as  Schachter 
does,  we  assume  slowly  varying  wave  length.  The  amplitude  is  assumed  to  be  a 
constant  because  the  variation  is  too  small  to  be  noticed  in  short  distances. 

Nonlinear  instability  theory  tells  us  that  instability  causes  steep  gravity  waves 
to  evolve  into  the  three-dimensional  spilling  breaking  waves.  This  three  dimensional 
wave  form  is  the  normal  wave  form  we  see  in  an  open  sea.  To  simulate  the  three 
dimensional  wave  patterns,  it  is  necessary  to  have  at  least  three  components  in 
equation  (9).  It  was  found  that  to  generate  a  nice  looking  water  wave  image,  each 
component  should  travel  in  a  direction  about  20  to  30  degrees  apart  from  the  other. 
The  image  looks  even  better  with  one  more  minor  component  on  each  side  of  the 
three  major  ones  (with  amplitude  and  wavelength  about  one  third  of  the  average 
values  of  the  major  components).1  In  Table  1  these  directions  are  specified  by 
parameter  0  which  is  the  angle  measured  counterclockwise  from  the  horizontal  axis 
(x-axis).  After  testing  several  functions,  we  decided  to  use  the  following  function 
for  the  slowing  varying  wave  length 

v  '  1  +  r  sin(px  +  py) 

where  /  is  the  mean  wave  length  and  r  is  the  normalized  deviation  of  wavenumber, 
and  p  is  the  parameter  which  shows  how  slowly  the  wavelength  varies.  In  our 

texture  psychophysics  has  shown  that  only  5  or  6  components  are  required  to  mimic 
very  complex  sums  of  sinusoids  (Richards,  1979). 
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simulation  p  was  fixed  at  0.1.  The  actual  values  of  the  other  parameters  are  listed 
in  Table  1.  The  unit  for  wave  length  and  amplitude  is  the  pixel. 


4-0  Wave  Dynamics 

The  wave  dynamics  are  solved  from  Navier-Stokes  equation  (Currie,  1975).  With 
the  assumptions  that  the  Coriolis  force  are  negligible,  water  is  inviscid  and  incom¬ 
pressible,  density  is  constant,  and  water  motion  is  irrotational,  the  Navier-Stokes 
equation  can  be  reduced  to 

V2tf  =  0  (11) 

where  tp  is  the  velocity  potential.  There  are  two  boundary  conditions  at  the  air- 
water  interface.  The  kinematic  boundary  condition  says  that  particles  at  free  sur¬ 
faces  remain  on  the  free  surface  and  the  dynamic  boundary  condition  says  that  the 
forces  balance  at  free  surface.  One  more  boundary  condition  says  that  the  vertical 
component  of  fluid  velocity  is  zero  on  the  ocean  floor  or  bottom. 

The  surface  tension  can  not  be  neglected  from  the  dynamic  boundary  condition 
when  the  wave  length  is  less  than  7  cm.  These  kind  of  waves  are  just  the  ripples  on 
top  of  the  larger  gravity  waves.  By  gravity  wave,  we  mean  that  the  gravity  force  is 
the  only  restoring  force  for  a  wave  system. 

For  a  gravity  wave,  if  the  wave  length  A  satisfies 

h  >  0.28A  (12) 

where  h  is  the  ocean  depth,  then  this  wave  is  called  a  short  wave  or  deep  water 
wave.  In  this  case,  a  wave  travels  with  velocity 

C  =  ^  %/  ^A/2ir  (13) 

* 

Because  this  velocity  depends  on  wave  length,  longer  waves  travel  faster  than 
shorter  waves  and  the  wave  system  is  dispersive.  On  the  other  hand,  if  the  wave 
length  satisfies 

k  <  0.07 A  (14) 

This  wave  is  called  a  long-wave  or  shallow  water  wave.  In  this  case,  the  waves  travel 
with  the  same  speed 

C  =  -Jgh  (15) 
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Component  Number 

1 

2 

3 

4 

Mean  wavelength  l 

25 

30 

37.5 

10.7 

16.7 

Deep  water  (h  >  0.28A) 

0.9* 

1.0* 

1.1* 

0.58  * 

0.74* 

Shallow  water  (h  <  0.07A) 

1.0* 

1.0* 

1.0* 

1.0* 

1.0* 

Table  2  Phase  shift  *  between  frames  used  to  generate  dynamic  water  waves. 


Therefore  waves  are  nondispersive  in  shallow  water. 

In  order  to  have  am  idea  about  how  these  two  modes  of  waves  look,  we  simulated 
traveling  waves  in  both  cases  based  on  our  static  model  described  before.  We 
generated  40  static  pictures,  being  the  maximum  allowed  by  our  computer  memory. 
Each  picture  was  phase  shifted  from  the  previous  one.  For  deep  water  waves  this 
phase  shift  was  proportional  to  the  square  root  of  wavelength  and  for  shallow  water 
waves  it  was  independent  of  wavelength.  The  exact  values  of  phase  shift  we  used  are 
listed  in  Table  2.  We  cycled  these  pictures  sequentially  and  video  taped  the  whole 
procedure.  It  turned  out  that  the  deep  water  mode  looks  better  than  shallow  water 
mode.  This  result  is  consistent  with  our  daily  life  experience.  Except  for  shore 
waves  and  turbulent  streams,  we  hardly  see  a  pond,  river,  or  sea  with  water  depth 
less  than  a  quarter  of  the  average  wave  length. 


5.0  Conclusions 

A  procedure  for  generating  near-natural  water-wave  images  by  a  computer  has  been 
presented.  Three  factors  play  the  dominant  role  in  our  perception  of  static  water 
images.  The  first  is  the  Fresnel  term  in  the  reflectance  function.  The  second  is 
hemispheric  illumination.  Although  a  point  light  source,  such  as  the  sun,  was  not 
included  in  our  simulation,  this  is  obviously  the  key  factor  for  the  speckle  and 
scintillation  effects  of  water  (Longuet-Higgins,  1960).  The  shape  and  composition 
of  the  wave-itself  was  also  found  to  be  important.  For  dynamic  water-waves,  it  is 
necessary  to  include  non-linear  terms  that  mimic  the  behavior  of  deep  water  wave 
systems. 


6.0  Appendix  I:  Image  Features 

In  the  main  body  of  this  paper  we  used  computer  graphics  to  determine  which 
physical  properties  were  significant  factors  in  creating  images  that  looked  like  water. 
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Figure  4  The  top  panel  shows  a  cross-section  through  a  surface  formed  by 
two  orthogonal  sinusoids  of  equal  wavelength  and  amplitude.  The  cutting  plane 
contains  the  line  of  sight  and  two  vertical  surface  normals,  and  hence  is  a  special 
view.  The  cross-hatched  region  will  appear  as  a  darker  “blob*,  defined  by  an 
isoluminance  contour  corresponding  to  an  emittance  angle  of  60°.  In  the  lower 
panel,  the  surface  normals  are  mapped  on  the  Gaussian  sphere,  with  the  viewer 
direction  normal  to  the  plane  of  the  page.  The  line  of  zero  Gaussian  curvature 
is  shown  as  dashed  (<c  =  0). 


Ignoring  dynamics,  these  were  i)  Fresnel  reflectance,  ii)  hemispheric  illumination, 
and  iii)  a  wave  packet  composed  of  (non-linear)  sinusoids  of  low  amplitude.  Here 
we  proceed  with  the  second  step  of  our  graphics  approach  to  image  understanding, 
namely  identifying  an  image  feature  characteristic  of  “water”  or  other  fluids  in 
motion . 

The  top  half  of  Figure  4  shows  a  cross-section  of  a  water  wave  viewed  from  an 
angle  of  about  20°.  The  visual  ray  is  tangent  to  the  wave  at  R,  which  is  a  point 
on  the  “rim”  of  the  surface  (Koenderink  k  van  Doom,  1982).  The  locus  of  all  such 
rim  points  will  be  a  great  circle  on  the  Gauss  map.  If  we  now  place  the  viewer 
direction  normal  to  the  page  (bull’s-eye),  then  the  rim  will  be  the  outline  of  the 
Gaussian  sphere,  as  illustrated  in  the  lower  half  of  the  same  figure. 

Consider  now  the  isoluminance  contour  of  5%  reflectance  on  the  wave  surface. 
For  a  Fresnel  reflectance  function,  this  corresponds  to  a  60°  viewing  angle.  We 
wish  to  determine  the  general  shape  of  this  contour,  and  hence  of  the  dark  region 
it  encloses.  Referring  to  the  Gauss  map,  the  region  of  60°  Fresnel  reflection  will  be 
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described  simply  by  a  circle  centered  about  the  bull’s-eye,  assuming  hemispheric 
illumination  without  wavelet  occlusions. 

To  determine  which  part  of  the  60°  Fresnel  circle  is  visible,  consider  the  locus 
on  the  sinusoidal  surface  which  has  zero  Gaussian  curvature  ( k  =  0).  For  our 
surface  waveform  of  crossed  sinusoids,  this  locus  can  be  shown  to  be  a  distorted 
square  (or  rectangle)  on  the  Gauss  map,  with  the  square  centered  about  N.  The 
region  contained  by  both  the  Fresnel  circle  and  the  n  =  0  square  thus  represents 
the  surface  orientations  of  the  dark  “blob”.  On  the  Gauss  map,  this  shape  will  have 
cusps  at  the  extremal  views.  (Obviously  the  exact  choice  of  cut-off  for  the  angle  of 
incidence  is  not  critical.) 

If  the  “blob”  region  on  the  Gaussian  sphere  is  now  mapped  back  onto  the 
surface,  we  obtain  two  dark  regions  abutting  at  the  locus  k  =  0.  We  wish  to 
show  that  the  extremal  ends  of  this  region,  namely  where  the  surface  normal  has 
orientation  M,  will  be  seen  as  “cusped”.  (A  cusp  on  the  Gauss  map  in  itself  is  not  a 
sufficient  condition  for  a  cusp  on  the  surface.)  One  approach  is  to  solve  analytically 
for  the  isoluminance  contour  and  show  that  the  first  derivatives  are  discontinuous 
here.  Alternatively,  we  note  that  as  we  move  toward  M  along  the  blob  contour,  for 
there  not  to  be  a  cusp  at  M,  the  isoluminance  contour  must  be  perpendicular  to  the 
parabolic  line  k  =  0,  because  M  is  an  extremal  point  of  the  contour.  But  generally 
this  will  not  be  the  case,  because  the  behavior  of  the  isoluminance  contour  and  the 
k  —  0  contour  are  controlled  by  independent  processes,  and  hence  generally  will 
intersect  transversally  as  they  do  on  the  Gauss  map.  The  extremal  end-points  of 
the  dark  “blobs”  of  water  should  thus  be  cusped.  (Close  inspection  of  photos  of 
water  confirm  this  prediction.) 

In  the  above,  our  analysis  assumed  a  special  viewing  position  and  a  special 
waveform.  First,  it  should  be  obvious  that  the  cusped  property  of  the  blob  will 
remain  unchanged  if  the  wavelength  of  either  one  of  the  sine  waves  is  increased 
(except  in  the  limit  as  the  wavelength  becomes  infinite).  Similarly,  we  expect  no 
effect  of  small  non-linearities  of  wave  shape.  However,  it  is  clear  that  viewing  angle 
can  alter  the  geometry  of  the  blob  drastically,  for  example  when  the  surface  normal 
N  approximates  the  viewer  direction,  such  as  when  we  look  almost  directly  down 
at  water.  In  this  case  the  tangent  to  k  =  0  at  M  may  lie  in  the  plane  of  the  view 
vector,  causing  the  cusp  angle  to  reach  x/2,  thereby  eliminating  the  discontinuity 
at  M.  The  cusped  feature  for  water  is  thus  guaranteed  only  for  shallow  (<  30°) 
viewing  angles.2 

One  final  point  can  be  made  about  the  shape  of  these  dark  “blobs”  character¬ 
istic  of  water.  Once  the  amplitude  of  a  wave  is  such  that  its  bitan  gents  intersect  at 

2Note  that  for  very  shallow  view  angles  the  lower  portion  of  the  dark  blob  of  one  wave 
will  be  occluded  by  the  bright  ridge  of  the  next  nearer  wave,  also  resulting  in  cusped 
ends,  or  even  in  an  extra  pair  of  cusps,  depending  upon  the  view  angle. 
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Figure  5  A  “zoom”  photo  of  a  water  scene,  showing  clearly  the  predicted  cusps 
and  image  features  (credit  for  photograph  -  unknown). 


less  than  120°  (see  Figure  4),  then  the  wave  will  crest.  Because  the  wave  silhouette 
(rim)  sets  bounds  on  the  shape  of  the  blob,  it  too  must  be  elongated  sufficiently  in 
order  not  to  violate  this  constraint.  Our  characteristic  image  feature  for  water  will 
thus  be  a  very  elongated,  horizontal  elliptical-like  blob  having  cusped  end-points  at 
its  extremal  viewing  positions.  It  is  of  interest  to  note  that  Ogden  (1985)  created 
an  illusion  of  a  water  surface  by  convolving  with  image  noise  a  mask  consisting  of 
two  such  blobs  of  opposite  sign  (simply  by  splitting  an  ellipse).  Similarly,  properly 
chosen  packets  of  Gabor  filters  can  also  generate  “water-like”  textures  by  (thresh- 
olded)  convolution  with  noise  (Daugman,  1988).  Both  of  these  masks  create  image 
patterns  with  dark,  elongated,  cusped  blobs. 

7.0  Appendix  II:  Wave  Shape 

Here  we  illustrate  the  psychophysical  effects  of  three  different  choices  for  the  shape 
of  water  waves. 

In  the  top  row  of  Figure  6  the  non-linearity  in  the  shape  of  each  wave  is  created 
simply  by  squaring  a  sinusoid  (this  was  our  final  choice). 
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Figure  6  The  psychophysical  effect  of  the  shape  of  the  water  wave  upon  its 
appearance,  using  Fresnel  reflection  and  hemispheric  illumination. 


For  comparison,  the  next  row  uses  a  square  of  a  triangular  wave  plus  its  second 
harmonic.  Note  the  negligble  difference  in  the  appearance  of  the  two  types  of  waves 
under  fresnel  reflection  and  hemispheric  illumination  (top  and  middle  right). 

Finally,  the  last  row  shows  the  appearance  of  water  if  only  the  linear  sinusoid 
or  the  fundamental  component  of  the  triangular  waveform  are  used.  Neither  of 
these  renditions  looks  as  good  as  the  non-linear  choices  above. 
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5.0  A  New  Reflectivity  Model 

Spectral  reflectance,  or  more  properly  albedo  p[X),  is  probably  the  simplest,  and 
certainly  the  most  colorful  surface  parameter  in  the  image-intensity  equation.  But 
in  addition,  we  must  also  consider  R — the  reflectance  function  itself. 

I  =  p(X)  ■  E(X)  ■  (N  ■  L)  ■  R  (1). 
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Figure  1  The  underlying  surface  is  two  orthogonal  sinusiods.  The  illumination 
and  reflectance  conditions  can  drastically  alter  the  inferred  shape  and  material 
type.  Note  that  special  conditions  are  required  to  create  an  image  that  looks 
like  water  (lower  right). 
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Equation  (1)  summarizes  again  our  basic  relation  between  surface  parameters, 
p(A),  N\  the  illumination  E{ A),  L,  and  image  intensities.  Cook  Sc  Torrance  have 
elaborated  this  equation  in  detail,  with  special  emphasis  upon  the  reflectance  pa¬ 
rameter,  R.  Here,  the  matte  versus  specular  aspects  of  the  surface  are  collected,  as 
well  as  textural  effects.  Following  Torrance  Sc  Sparrow  (1967),  the  textural  com¬ 
ponent  of  the  surface  is  considered  only  at  the  optical  level,  so  the  micro-texture 
itself  is  not  visible  [although  see  Blinn  (1978)  for  a  treatment  of  visible  textural  ef¬ 
fects].  Although  the  interaction  between  the  illumination  conditions  and  reflectance 
parameters  is  obvious  from  equation  (1),  the  full  force  of  this  interaction  did  not 
strike  us  until  we  began  to  model  water.  Furthermore,  as  shown  in  Figure  1,  the 
visible  surface  structure,  not  just  the  invisible  microfacet  distribution,  is  critical  to 
how  we  infer  surface  material  type.  In  this  figure,  the  underlying  surface  structure 
for  all  six  panels  in  the  top  two  rows  is  simply  two  orthogonal  sinusoids  at  90°  to 
each  other.  The  upper  left  panel,  which  uses  Phong  shading  with  some  diffuse  (i.e. 
matte)  illumination,  looks  roughly  “correct”,  as  does  the  Cook  Sc  Torrance  rendi¬ 
tion  using  an  index  of  refraction  of  10.  Note  the  importance  of  adding  the  matte 
component  to  a  surface,  for  when  this  is  taken  away  (upper  right),  the  underlying 
surface  can  not  be  seen.  Note  also  that  the  choice  of  index  of  refraction  is  not  crit¬ 
ical  as  long  as  there  is  a  dominant  matte  (diffuse)  reflectance  component  (compare 
left-most  in  second  row  with  top-middle  in  first  row).  The  refractive  index  does, 
of  course,  critically  affect  the  specular  reflectance  (middle,  second  row  versus  top 
right,  first  row).  Finally,  rather  than  having  just  a  single  light  source  illuminating 
a  completely  specular  surface  (top-right  or  middle,  second  row),  let  the  source  oc¬ 
cupy  the  entire  hemifield  (right-most  in  second  row).  The  result  looks  more  like 
ribbon  candy  or  an  E-M  photo  than  crossed-sinusoids.  Yet  once  the  amplitude  of 
the  sinusoids  is  reduced  and  made  typical  of  water,  the  surface  indeed  begins  to 
look  like  a  common  one — namely  water  (lower  right).  Clearly  there  is  an  important 
interaction  between  the  visible  structure  of  the  surface,  its  reflectance  properties, 
and  the  illumination  typical  for  that  surface. 

To  capture  this  perceptual  effect,  we  have  recently  proposed  a  new,  psycho- 
physically-based  reflectance  model  that  makes  the  visible  surface  roughness  an  ex¬ 
plicit  parameter  of  the  representation  of  material  type  (Richards,  1988). 

The  basic  idea  is  that  facets  on  the  surface  may  appear  over  a  range  of  scales, 
some  visible,  some  not,  depending  upon  the  viewing  distance.  What  is  needed  is 
a  parameter  that  captures  this  surface  property.  One  natural  choice  is  a  fractal 
measure.  The  new  representation  includes  the  parameter  h  (roughness),  and  q 
(specular  component),  as  illustrated  in  Figure  2.  Each  takes  on  values  between  0 
and  1.  For  example,  whenever  p,  h  or  q  is  zero,  then  the  surface  will  appear  black, 
for  then  (i)  either  the  albedo  p  is  0,  or  (ii)  the  surface  is  so  rough  that  no  light  can 
escape  to  the  viewer  (a  black  body!),  or  (iii)  the  surface  is  a  perfect  mirror  with 
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Figure  2  Representation  for  reflectance.  Axes  are:  albedo,  p;  Hansdorf  dimen¬ 
sion  less  1,  h,  and  a  measure  of  the  index  of  refraction,  rj,  scaled  to  be  between 
0  and  1. 

index  of  refraction  r)  =  0,  where  light  can  not  be  reflected  to  the  viewer  except 
under  very  special  conditions  (such  as  retro-reflect  materials).  Intuitively,  surfaces 
with  a  high  fractal  dimension  (large  Hausdorf-Besicovitch  number)  will  be  “rough” 
and  hence  will  be  dark  regardless  of  albedo  (p)  and  refractive  index  (r>).  Whereas 
surfaces  with  a  low  fractal  dimension  ( H  =  1  -  h)  will  be  smooth,  and  can  be 
either  dark  or  light  depending  upon  (p)  or  (rj).  This  insight  gives  new  meaning 
to  “facet  distribution”  for  both  matte  (p)  and  specular  (»7)  reflectance.  As  shown 
by  Pentland  (1983,  1984,  1988)  this  measure  can  be  recovered  easily  from  image 
information  for  matte  surfaces.  Appendix  I  extends  his  proof  to  glossy  surfaces. 
Hence  a  fractal  facet  distribution  function  can  be  estimated  from  images,  at  least, 
for  surfaces  which  approximate  fractals — a  very  large  class.2 

Thus,  in  our  formulation  for  the  reflectance  function,  all  the  matte  properties 
of  the  surface  are  captured  by  the  albedo  parameter,  p.  All  the  specular  properties 
thus  appear  in  the  reflectance  function,  R,  whose  two  major  parameters  are  rj  (in¬ 
dex  of  refraction)  and  h,  which  describe  the  surface  roughness.3  The  recovery  of 
the  surface  material  would  require  estimating  these  three  parameters  from  image 
intensities.  As  already  noted,  this  is  a  difficult  task  in  the  general  case  because  of 
the  complexity  of  the  full  image  intensity  equation  (1).  However,  in  many  cases, 

2As  an  alternative  to  the  fractal  measure,  see  Lewis,  1987. 

3Note  also  that  h  may  have  an  orientation  component,  such  as  water  does  (i.e.  approxi¬ 
mately  1-D)  whereas  clouds  are  typical  of  a  2-D  fractal  function  (see  frontispiece).  This 
aspect  of  surface  texture  needs  further  study. 
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conditions  or  surfaces  exist  where  the  image-intensity  equation  becomes  quite  sim¬ 
ple.  For  example,  particular  locations  on  the  surface  (occluding  edge,  bright  spot, 
hyperbolic  region,  etc.)  may  each  provide  partial  information  which  can  be  assim¬ 
ilated.  Alternately,  some  common  surfaces  like  water  or  clouds  have  very  special 
properties  leading  to  quite  simple  reflectivity  functions  for  certain  very  common  il¬ 
lumination  conditions  such  as  extended,  hemispheric  illumination  or  direct  sunlight 
(see  Section  4.2). 
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6.0  Dynamical  Properties 

Most  surfaces  appear  in  motion  on  our  retina,  either  because  of  our  own  movements 
or  because  the  surface  itself  is  moving.  There  is  a  large  body  of  work  analysing  the 
image  correlates  of  the  motion  of  rigid  objects  or  surfaces  (see  review  by  Ullman, 
and  Section  5.2  and  5.3).  Much  less  attention  has  been  given  to  the  motion  of 
non-rigid  objects  [although  see  Ullman  (1984),  Grzywacz  &  Hildreth  (1987)  and 
Koenderink  ic  van  Doom  (1986)].  Even  these  studies  have  not  considered  the  actual 
physics  of  the  deformations  underlying  surfaces  like  flags  waving  in  the  breeze,  or 
fluid  motions  such  as  clouds,  oil  or  water.  (One  exception  is  Kajiya’s  (1984)  study 
of  cloud  formation.)  Here  we  begin  an  analysis  of  the  dynamics  of  water-wave 
motion.  Then  we  conclude  with  a  more  traditional  analysis:  the  optical  flow  of 
planar  surfaces.  Although  this  study  is  for  a  rigid  surface,  the  ambiguity  is  relevent 
to  understanding  the  appearance  of  water  wave  motion,  whose  image  structure 
looks  like  dark  features  moving  in  concert  across  the  image  plane.  A  later  study 
of  the  relation  between  the  smoothness  of  the  (image)  velocity  field  and  surface 
rigidity  is  also  relevant,  reported  here  in  Section  5.3.  This  second  Btudy  shows 
that,  in  a  structural  sense,  smoothness  of  the  velocity  field  follows  from  rigidity  of 
motion.  Water  waves  in  motion  present  a  smooth  velocity  field,  yet  the  surface  is 
certainly  non-rigid.  More  work  is  needed  to  clarify  this  ambiguity. 

6.1  Non-Linear  Wave  Interaction  Between  Long  and  Short 
Waves  (T-.J.  Kung) 

1.0  Introduction 

In  order  to  understand  the  appearance  of  waves  in  an  image,  we  explore  here  the 
origin  and  behavior  of  short  waves  which  are  less  than  40  cm.  These  waves  are 
clearly  visible  not  only  to  our  eye  but  also  to  SAR  (Synthetic  Aperture  Radar). 
Surprisingly,  SAR  images  have  revealed  a  wealth  of  ocean  phenomena,  some  of 
which  are  correlated  to  local  wind  structure,  long  gravity  waves,  and  large-scale 
circulation  patterns.  Such  findings  require  a  consideration  of  the  interactions  and 
energy  transfer  between  wave  components  having  both  long  and  short  wavelengths. 

We  begin  by  considering  infinitesimal  surface  water  waves,  where  the  differen¬ 
tial  equation  to  be  satisfied  by  the  velocity  potential  is 

4>xx  +  4>x*  =  0  (l) 

with  boundary  conditions 
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4>z(x,  o,  t)  =  — cos  — (z  —  e<) 

Mx>  o,  t)  +  g<t>z{x,  o,  t)  =  0  (2) 

4>x(z,  —k,  t)  =  0 

where  x-axis  is  on  the  undisturbed  free  surface,  z-axis  is  perpendicular  to  the  undis¬ 
turbed  free  surface,  g  is  acceleration  of  gravity,  and  h  is  the  depth  of  water.  Solving 
the  above  equations,  we  obtain  the  dispersion  relations 

u2  =  ^Jktanh(JkA)  (3) 

For  the  case  of  deep  water,  A  and  tanh(Jkh)  «  1. 

u2  =  gk  or  c  =  ^  =  y/gjk  (4) 

This  is  a  dispersion  wave  system.  The  other  limit  is  that  of  shallow  water,  h  A 
and  tanh(fcA)  «  kh,  therefore 

u2  =  k2gh  otc  =  ~=  sfik  (5) 

This  is  the  non-dispersive  system.  The  above  analysis  is  based  on  the  assumption 
that  the  wave  amplitude  a  is  small  compared  with  either  depth  h  or  wave  length 
A  whichever  is  smaller.  Therefore  the  boundary  conditions  can  be  simplified.  This 
analysis  is  actually  an  approximation  of  a  nonlinear  problem  in  which  the  nonlinear 
terms  are,  in  some  sense,  small.  Here,  we  want  to  examine  how  the  nonlinear  terms 
may  affect  a  wave  system  through  the  mechanism  of  wave-wave  interaction. 

By  way  of  background,  let  us  consider  a  very  elementary  property  of  oscilla¬ 
tion  systems.  A  linear  oscillator  subjected  to  an  infinitesimal  forcing  function  is 
described  by  the  equation 

xtt  +  k?z  =  ee*nt  (6) 

where  k  is  the  natural  frequency  of  the  system  and  fi  the  frequency  of  the  forcing 
function.  The  response  of  the  system  from  an  initial  state  of  rest  is  small  (of  order 
e)  unless  k2  is  very  newly  equal  to  fi2  — that  is,  unless  there  is  resonance  between 
the  frequency  of  the  forcing  function  and  the  natural  frequency  of  the  system.  If 
the  two  are  precisely  equal,  the  amplitude  of  the  oscillation  grows  linearly  with  time 
and  becomes  arbitrarily  large.  Indeed,  the  only  way  that  the  response  can  become 
large  is  through  the  phenomenon  of  resonance. 

A  very  similar  effect  is  involved  in  the  interaction  among  wavetrains.  Let  us 
consider  two  interaction  wavetrains,  given  to  a  first  approximation  by  solutions  to 
the  linear  equation. 
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oiexpt^  f-wiO,  ^ 

<12  exp  <(*2  •  Z  —  W2<)» 

where  the  individual  wavenumbers  and  frequencies  are  related  by  the  dispersion  re¬ 
lation  u>i  =  Ui(Ki),  i  =  1,  2.  The  solution  to  the  complete  nonlinear  equation  would 
be  expected  heuristically  to  be  of  the  same  general  form,  though  the  amplitude 
‘a’  may  possibly  vary  with  time.  We  might  seek  solutions  to  nonlinear  equation 
by  substituting  expressions  of  the  type  (7)  into  the  small  nonlinear  terms.  If  the 
lowest-order  nonlinearity  is  quadratic,  this  substitution  leads  to  expressions  of  the 
type 

expt^(£}  ±  Hi)  ■  2  —  (<*>i  ±  W2)tj.  (8) 

Now,  these  terms  act  as  a  small  amplitude  forcing  function  to  the  essentially  linear 
system  and  provide  an  excitation  at  the  wavenumbers  (*i  ±  k2)  and  frequencies 
(u>i  ±cli3).  The  response  of  the  system  to  this  forcing  function  can  be  expected  to  be 
small  (that  is,  of  order  e)  unless  resonance  occurs — that  is,  unless  the  waver  iber 
and  frequency  at  which  the  forcing  is  applied  correspond  to  a  (wavenumb  fre¬ 
quency)  pair  of  a  natural  wave  mode.  If  they  do  correspond,  we  have  resonance  and 
growth  of  this  new  component  with  continuous  energy  transfer  from  the  two  original 
wave  components  to  this  third  one.  If  the  initial  amplitude  of  the  third  component 
is  zero,  then  under  resonant  conditions  it  grows  linearly  until  the  energy  drain 
from  the  other  components  begins  to  reduce  their  amplitudes  and  consequently  the 
amplitude  of  the  forcing  function. 

The  above  investigation  is  for  discrete  waves.  A  lot  of  research  has  studied  the 
interaction  phenomena  between  waves  having  widely  different  length  scales.  Since 
the  nonlinearity  is  weak,  it  takes  a  long  time  to  see  zeroth  order  changes  of  the 
system.  However,  when  the  group  velocity  of  a  wave  packet  is  equal  to  the  velocity 
of  a  natural  mode  wave,  we  expect  to  see  dramatic  interchange  of  energy  between 
them  just  like  the  case  of  discrete  wave  system.  Let  us  consider  the  infinitesimal 
surface  water  wave  again.  If  we  have  a  pond  with  uniform  depth  h,  then  a  wave  with 
wave  length  A  larger  than  2 h  is  considered  a  long  wave  which  travels  at  velocity 
y/gK  (naturally  defined  by  the  system).  On  the  other  hand,  we  can  also  have  a 
wave  packet  in  which  the  wave  lengths  are  always  smaller  than  h.  This  wave  packet 
is  a  deep  water  wave  (short-wave)  system  corresponding  to  this  pond  therefore  it 
travels  at  group  velocity 

Cn  =  =  -  \f^  (in  linear  case) 

u  &K  2  V  <c 

If  the  wave  number  of  the  short  wave  is  right,  the  group  velocity  can  be  equal  to 
y/gK,  the  long  wave  velocity.  This  project  is  to  study  the  phenomena  of  resonance 
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when  the  group  velocity  of  capillary  waves  is  equal  the  velocity  of  long  waves.  The 
method  of  multiple  scales  has  been  proved  to  be  very  effective  in  studying  wave 
interactions  and  will  be  used  in  this  analysis. 


2.0  Theory 

For  the  case  of  a  progressive  gravity-capillary  wave  moving  on  the  free  surface  of 
a  liquid  of  constant  depth  h,  the  undisturbed  free  surface  corresponds  to  the  plane 
z  =  0,  where  z  points  vertically  upwards,  and  the  bottom  is  located  at  z  —  -h. 
The  remaining  Cartesian  co-ordinates  x)  y  are  in  the  plane  of  the  undisturbed  free 
surface,  and  we  choose  x  to  point  in  the  direction  of  the  wave  propagation.  Since 
the  fluid  motion  is  irrotational,  a  velocity  potential  y,  z,  t)  satisfying  Laplace’s 
equation 

<f>xx+<t>yy+<f>xz=  o  -h<z<£.  (12) 

can  be  defined,  where  £(x,  y,  t)  denotes  the  position  of  the  undulating  free  surface. 
The  boundary  conditions  for  the  motion  are 

4>z  =  0  at  z  =  -h.  (13) 


4>z  =  (t  +  4>xtx  +  tvtv  at  *  =  i 


(n) 


+  +  l (*1  +  ^  +  &  +  M1  ^ ~ Hnkiz  ( 


(i  +  ii  +  «) 


(15) 


Equation  (13)  is  the  boundary  condition  for  the  bottom  where  the  normal  velocity 
is  zero.  Equation  (14)  is  the  kinematic  bounde’-y  ccndi*;on  at  top  surface  and 
equation  (15)  is  the  dynamic  boundary  condition.  The  parameter  T  is  the  ratio 
of  the  surface  tension  coefficient  to  the  fluid  density  and  g  is  the  gravitational 
acceleration.  We  suppose  that  initially  (at  t  =  0)  the  surface  is  distorted  in  the 
manner 


t(z>  y,t  =  0)  =  { A(«,  ty)eiKX  -  }  (16) 

0(1  +r)  >•  > 


where  T  =  K2T/g,  the  asterisk  denotes  the  complex  conjugate  and  e  is  a  nondimen- 
sional  parameter  measuring  the  slope  of  the  wavy  surface,  which  has  wave  length 
2 x/k.  The  envelope  A(e,  e)  of  the  surface  distortion  is  allowed  to  possess  a  slow 
spatial  variation  and  the  frequency  is  determined  uniquely  by  the  value  of  k  through 
the  dispersion  relation 


u 


j?*<T(l  +7)}’ , 


(17) 
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where  a  =  tanh(fcA). 

We  now  derive  the  equations  describing  the  time  evolution  of  A  when  the 
motion  is  only  weakly  nonlinear  (0  <  t  €  1).  On  this  basis  we  assume  that  (12) 
through  (16)  have  a  solution  of  the  form 

4>  =  +  eV2)  +«V3)  +•••  (18) 

€-««<»> +«*€**>+«*€<*>  +  .■ ••  (19) 


Also,  we  introduce  the  multiple  scales 

£  =  «(*-  cgt),  n  =  tV,  r  =  1 2t,  £i  =  e2(x  -  cgt)  (20) 


eg  denotes  the  group  velocity  and  is  given  by  the  relation 


eo 


6w  f 


er  +  kh(  1  —  g2) 
2(7 


IT?)- 


u 

Cn  — 


(21) 


Substituting  these  forms  into  the  governing  set  of  equations,  solving  successively 
the  equations  resulting  from  repeated  use  of  the  limit  process  e  — ►  0,  with  x,  y,  z, 
t,  £,  r)  and  r  fixed,  and  using  the  notation 

E  =  exp{»'(«x  -  wt)}  (22) 

one  obtains  the  following  results: 

iAr  +  ^A(f  +  fiAfjrt  =  +  i/\AQ  (23) 

(gh  —  Cg)Q((  +  ghQVn  =  *(|A|2)i,rf  (24) 

where 

Equation  (23)  shows  that  the  short  wave  satisfies  the  self  modulated  nonlinear 
Schrodinger  equation  plus  an  extra  term — the  interactions  between  short  waves 
and  long  waves,  while  the  modulation  of  long  waves  is  totally  excited  by  short 
waves  as  can  be  seen  in  equation  (24).  The  coefficient  v  of  the  cubic  nonlinear 
term  is  singular  when  c9{k )  =  (gh)1^2.  This  corresponds  to  a  long-wave/shorc- 
wave  resonance  in  which  the  group  velocity  of  the  short  (capillary)  wave  matches 
the  phase  velocity  of  the  long  (gravity)  wave.  Equation  (23)  and  (24)  break  down 
under  this  condition  and  a  different  analysis  and  scaling  are  required. 

Djordjevic  ic  Redekopp  demonstrated  that  the  long-wave/short-wave  resonant 
interaction  occurs  on  a  much  shorter  time  scale  than  that  for  self- modulation.  They 
restricted  their  discussion  to  the  one-dimensional  case  and  introduced  the  multiple 
scales 
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i  =  £2/3(z  -  cgt),  r  =  €A>\  6  =  £4/3(x  -  cgt), . . .  (26) 

The  evolution  equations  describing  this  interaction  turned  out  to  be 

iAr  +  XA^^  —  BA  (27) 

Br  =  -Q(|A|2)e,  (28) 

where 

B((,  r)  =  8*{0)  (29) 

In  this  project  we  extend  the  analysis  of  Djordjevic  ii  Redikopp  to  two-dimen¬ 
sional  case.  The  multiple  scales  need  to  be  reexamined.  It  is  assumed  that  <f>  ~ 
(l//i)  and  n  1  since  long-wave/short-wave  resonant  interaction  occurs  on  a 
shorter  time  scale.  Considering  equation  (23),  the  second  term  on  left  hand  side 
and  last  term  on  right  hand  side  gives  the  relationship  A((  ~  A<j>(  or  A ^  ~  A<p. 
Therefore  we  can  write  the  new  streamwise  direction  scale  in  terms  of  old  streamwise 
direction  scale  a s  ('  =  £//*.  From  first  and  second  term,  Ar  ~  A((.  The  new  time 
scale  is  related  to  the  old  time  scale  by  r'  =  r/n2  because  r'  ~  (£')2  =  (£/m)2  = 
r/ji2.  When  resonance  occurs,  e2  =  gh,  the  first  term  of  equation  (24)  vanishes. 
The  next  dominate  term  is  c<fi(T  and  it  has  the  same  order  of  magnitude  as  the 
term  on  right  hand  side  of  the  same  equation.  We  obtain 

and 

=  <2/3(*  -  Cgt). 

r'  =  r /«*/»  = 

Equalize  the  order  of  second  term  with  the  rest  of  the  terms  in  equation  (24),  we 
find  that  the  scale  for  cross-stream  direction  remains  the  same,  i.e. 

v'  =  *1  =  ey- 

As  a  conclusion,  our  multiple  scales  looks  like 

(  =  £2/3(X  -  Cgt),  <1  =  £4/3(x  -  Cgt),  .  .  . 

T  =  «4/3t 
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Figure  1-4  Numerical  solutions  for  ID  equations  (27  and  28),  showing  short 
and  long-wave  envelopes. 


Based  on  new  scales,  we  derived  the  evolution  equations  describing  two-dimen¬ 
sional  long- wave/short  wave  interaction  as  following; 


iAr  +  AA^  —  SAj>{  (32) 

4>It  +  Ptw  = -g^A]2)^  (33) 

If  we  substitute  with  /?  as  in  1  -  D  case,  i.e.  define  £(£,  tj,  t)  =  6<f>(,  we  get 


xAr  +  A  A**  =  BA 


U 

oo 


—  — q(|A|2){ 


(34) 

(35) 


A  computer  program  is  developed  to  solve  this  set  of  equations  numerically.  The 
results  are  shown  in  next  section. 
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S.  0  Results 

(A)  One-dimensional  case. 

Some  interesting  properties  about  the  1-D  evolution  equations  [(27)  and  (28)]  were 
found  by  Koop  and  Redekopp.  The  instability  is  unidirectional  in  the  sense  that 
the  long  wave  cannot  generate  the  short  wave  if  the  short  wave  is  initially  zero.  For 
the  case  of  uniform  wavetrain  with  small  perturbation,  the  instability  criterion  is 
independent  of  the  long-wave  amplitude.  They  also  showed  that  the  unstable  range 
is  0  <  k  <  2.182  in  terms  of  the  normalized  variables  where  k  is  the  wave  number 
of  small  perturbation.  Numerical  computation  was  carried  out  for  two  cases  (Ref. 
to  Figure  1  -  4),  k  =  2.1  (unstable)  and  k  =  2.25  (stable)  with  initial  conditions: 

S  =  l  +  S+etK(  +S-e~'K(  (36) 


L=  0 

Near-perfect  recurrence  was  observed  in  unstable  case. 


(37) 


(B)  Two-dimensional  Case 

The  property  of  unidirection  instability,  i.e.  the  long  wave  cannot  generate  the 
short  wave  if  the  short  wave  is  initially  zero,  is  preserved  in  2-D  case.  However, 
the  integral  term  in  equation  (35)  does  make  2-D  case  behave  differently  from  1-D 
case.  Let  us  start  the  discussion  with  the  long  term  behavior  of  the  long  waves. 
From  equation  (33) 

$fr  +  =  ~a(\A\2)( 


Since  the  nonlinear  effect  is  weak,  we  assume  that  tp  behaves  close  to  linear  system, 
i.e.  the  form  of  <j>  is  assumed  to  be 


<P  = 


F(k, l )  exp 


(38) 


The  stationary  phase  of  <p  gives  relationship 


£  =  V/4r. 


(39) 


This  means  that  the  long  waves  travel  leftward  when  they  propagate  outward.  It 
suggests  that  the  integration  in  equation  (35)  should  start  from  right-most  bound¬ 
ary  and  the  integrated  value  can  not  be  periodical  in  the  stream-wise  direction. 
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Instead  of  using  periodical  boundary  conditions  as  we  did  in  the  1-D  CL3e,  we  im¬ 
pose  zero  gradient  at  the  boundaries.  Since  a  uniform  wavetrain  with  periodical 
perturbation  can  no  longer  be  a  solution,  we  start  with  a  Gaussian  as  initial  con¬ 
dition  as  shown  in  Figure  (5).  As  the  short  wave  propagates  [Figures  (7)  and  (9)], 
the  interaction  appears  only  in  stream-wise  direction.  They  never  expand  in  the 
cross-stream  direction.  This  is  a  proof  of  unidirection  instability.  On  the  other 
hand,  the  long  waves  [Figures  (8)  and  (10)]  propagate  into  the  left  half  of  the  do¬ 
main  as  we  predicted  earlier.  The  boundary  affects  can  be  seen  in  Figure  (9)  and 
(10)  compared  with  Figures  (13)  and  (14).  They  both  have  exactly  the  same  initial 
condition  except  that  the  domain  of  the  later  case  was  doubled. 

A  couple  of  different  initial  conditions  [as  shown  in  Figure  (15)  and  (16)]  were 
tested.  They  were  so  designed  that  they  are  close  to  wavetrain  but  without  the 
periodical  property.  The  results  show  that  the  wave  number  of  the  perturbation 
does  effect  the  interaction  rates,  i.e.  smaller  wave  numbers  have  faster  interaction 
rates  as  in  1-D  case.  But  in  all  cases,  the  amplitude  of  the  envelope  of  short  waves 
and  the  amplitude  of  long  waves  only  oscillates  within  a  moderate  range.  The 
unstable  case  as  in  Figure  (l)  was  not  found. 


4-0  Conclusion 

The  equations  to  describe  the  two-dimensional  resonant  interaction  between  a 
shallow-water  gravity  wave  and  a  capillary  wave  have  been  derived.  It  is  found 
that  the  resonant  interaction  occurs  on  a  time  scale  0(c~4/3)  just  as  in  the  one- 
dimensional  case.  The  spatial  scale  in  stream-wise  direction  is  also  the  same  as 
in  the  1-D  case,  namely  0(e~ 3^3),  but  the  cross-stream  direction  is  different,  be¬ 
ing  0(e-1).  The  dynamical  characteristics  of  2-D  resonant  interaction  are  quite 
different  from  the  1-D  case,  as  illustrated  by  numerical  examples. 
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Figure  5-10  Development  of  short-long  wave  Interactions  in  the  two-dimensional 
case  of  showing  both  short-  and  long-wave  envelopes. 


Figure  11-16  Other  examples  of  the  development  of  the  short-  and  long-wave 
envelopes. 
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6.2  Optical  Flow  of  Planar  Surfaces  (S.  Ullman) 


I.  Introduction 

When  ulijcit-  iimir  hi  iln  'Mil  iri.nin.nl  will.  i  in  t  tic-  \iruir.  tin 

imago  they  casi  upnii  i)i<  rdiii.i  uiiilcrgn  complex  l ransformat ions 

Tin-  liiitiian  visii.il  —I «  ir i  caii  iiiicrpii  !  iln  »i  i  rau-f. irm.ii mil-  in  remver  tin 
i  lirw-dimctisinual  (3-0)  slruclurc  .  >1  tlic  viewed  objects  and  tln-ir  mot  inn  in 
spare 

A  substantial  number  of  cuinpuiai  innal  studies  have  investigated  this  ra¬ 
pacity  to  recover  structure  from  motion.  The  main  two  goals  of  these  studies 
have  been  (i)  to  determine  the  conditions  under  which  the  3-D  structure  can 
be  recovered  uniquely  from  the  changing  projection,  and  (li)  to  develop  meth¬ 
ods  for  computing  the  3-D  structure  and  motion  in  space  from  the  projected 
transformations. 

With  respect  to  the  uniqueness  problem,  the  main  conclusion  from  these 
studies  has  been  that  for  non-planar  rigid  objects  in  motion,  the  3-D  structure 
and  motion  are  determined  uniquely  by  the  projected  transformation.  Results 
of  this  type  have  been  established  for  both  perspective  and  orthographic  pro¬ 
jections,  and  for  different  forms  of  input  to  the  recovery  process,  including 
discrete  points  in  discrete  views,  discrete  points  and  their  velocities,  and  a 
continuous  velocity  field. 

The  uniqueness  results  obtained  in  these  studies  depended  criticially  upon 
the  object  being  non-planar.  Until  recently,  the  problem  of  interpreting  the 
motion  cast  by  planar  surfaces  has  remained  relatively  unexplored.  The  prob¬ 
lem  has  obvious  practical  implications,  since  many  surfaces  are  planar  or 
nearly  planar,  and  in  these  cases  methods  that  assume  non-planarity  would 
be  incorrect  or  unreliable.  As  mentioned  by  Waxm&n  L  Ullman  (1983)  and 
by  Longuet-Higgins  (1984),  the  analysis  of  motion  relative  to  a  planar  surface 
may  be  relevant  to  certain  situations  of  nighi -landing  of  an  aircraft,  where 
the  main  visual  cues  are  roughly  coplanar. 

Hay  (1966)  was  apparently  the  first  to  analyze  mathematically  the  visual 
interpretation  of  moving  planes.  Hay’s  analysis  assumed  that  the  visual  input 
is  given  in  the  form  of  two  discrete  views  obtained  from  a  set  of  points  in 
motion. 

The  main  result  established  by  Hay  was  that  the  interpretation  problem 
in  this  case  exhibits  in  general  a  two-fold  ambiguity.  In  addition  to  the  moving 
plane  that  has  actually  cause  the  viewed  transformation  there  is  in  general 
one  additional  “confusable"  plane.  The  second  solution  is  entirely  different 
from  the  first  in  its  spatial  orientation  and  motion  through  space. 

Similar  results  have  been  established  by  Tsai  ii  Huang  (1981),  based  on  a 
different  method  of  analysis.  They  have  also  shown  that  when  the  translattion 
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in  *|i;ki-  ill  iiliinj.'  I  Ik  n-ir/iirtl  In  lh<-*i  irhtf-.  (In  Jiml'CiiiM  d‘*.»(i  fn-nr-  at  id  I  hr 
iiiiiilmii  Ihkhiiik*  h i •  i*. | * i* 

A  recent  *lud>  l>>  l.tiiig'iet-lliuuiii*  ( |*»*«- 1 )  lui*  uleiil llied  an  addit lunal 
ri  until  n  ilt  under  vv  1 1  i  i"  1 1  (In'  t  \\  i  i-fi  tic!  :*  1 1  it  >t  a  •  i  il  x  di*:i)i)i«;ir-  Me  «!*<•  pruv  idl’d  a 
different  mi  dn.il  id  nnalysi*  and  a  diflrrem  algorithm  fur  rnriipiil nip  the  2-1) 
parameter* 

The  current  paper  extend*  the  analyst*  of  die  planar  velocity  field  (i  <•.. 
the  velocity  field  induced  by  a  mm mg  planar  surface)  in  four  directions  First, 
it  uses  a  different  form  of  input  to  the  recovery  process.  Instead  of  discrete 
points,  it  uses  continuous  flow  parameters  such  as  local  vortieity,  shear,  and 
their  derivatives.  The  use  of  flow  parametsrs  in  the  analysis  of  the  optical 
velocity  field  was  suggested  originally  by  Koenderink  L  van  Doom  (1975) 
and  by  Longuet-Higgins  A:  Prazdny  (1980).  An  analysis  based  entirely  on  this 
form  of  input  has  been  developed  recetly  by  Waxman  A:  Ullman  (1983).  When 
this  method  was  applied  to  planar  surfaces,  it  was  found  empirically  that 
there  were  in  general  two  distinct  3-D  solutions.  The  result  was  not  proven 
mathematically,  however,  and  the  possible  existence  of  additional  solutions 
was  not  ruled  out 

An  analysis  based  on  input  in  the  form  of  flow  parameters  is  of  interest  for 
two  reasons.  First,  as  emphasized  by  Koendrink  A-  van  Doom  and  by  Longuet- 
HigginsA:  Prazdny.  the  flow  parameters  are  convenient  in  the  sense  that  they 
usually  have  a  clear  geometric  interpretation,  and  some  of  them  are  invariant 
with  respect  to  the  choice  of  a  coordinate  system.  Second,  in  analogy  with 
the  non-planar  case,  it  is  of  interest  to  examine  the  interpretation  problem 
for  different  forms  of  input  since  the  analysis  of  each  case  is  usually  different, 
leading  to  a  different  recovery  method  with  somewhat  different  properties. 

The  second  direction  in  which  the  current  paper  extends  previous  inves¬ 
tigations  is  the  consideration  of  confusable  non-planar  solutions  Previous 
studies  have  assumed  that  the  moving  surface  is  known  to  be  planar,  and  ex¬ 
amined  the  number  of  possible  planar  solutions  It  is  of  interest,  however,  to 
determine  the  ambiguity  of  the  interpretation  when  the  moving  surface  is  not 
known  in  advance  to  be  planar.  It  is  shown  here  that  confusable  non-planar 
solutions  can  in  general  be  ruled  out.  Third,  the  method  of  analysis  is  differ¬ 
ent  from  previous  studies,  leading  to  a  different  family  of  possible  algorithms. 
Finally,  the  information  available  in  the  orthographic  rather  than  perspec¬ 
tive  velocity  field  of  planar  surfaces  is  analyzed  Since  under  local  analysis 
perspective  and  orthorgraphic  projections  become  almost  indistinguishable, 
this  analysis  indicates  some  limits  on  the  information  that  can  be  extracted 
reliably  from  a  local  analysis  of  the  velocity  field. 
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2.  Tlir*  Velocity  Field  of  a  Moving  Sui  f.irc 


Till'  ailrth  -I'  III  1  Ilf  pi. I  lint  M-l<  .1  II  V  Ill'll  I  Will  |  II  •  il  ffil  III  Itt.i  »I*|IS  III  I  III 
firm  stop  1  lie  projected  vclucii  v  fit  fd  w  ill  in-  expre-.M-d  In  term*-  of  I  lie  .’{-I)  shape 
ami  mill  inn  parameters  nl  tin  nulm  ini;  -nrl.n  i-  Tin  <!<--.<  ripi  <>m  will  follow 
the  derivation  in  Waxmaii  L  l  liman  {TJ83)  'JTiia-  slop  is  straightforward, 
and  t lie  resulting  2-1)  velocity  field  is  oliviou-  |\  uniquely  determined  by  the 
3-D  parameters  The  second  siep  (Sections  3.1.5)  rnnsists  of  inverting  this 
process:  given  the  instantaneous  (low  field  of  a  planar  surface,  the  problem  is 
to  recover  the  unknown  3-1)  parameters 

2.1  .Notation 

Following  Longuet-Higgins  L  Prazdny  we  will  use  a  coordinate  system 
(X,Y,Z)  moving  with  the  observer  relative  to  the  scene.  The  origin  of  the 
coordinate  system  is  the  vertex  of  the  perspective  projection  from  the  scene 
to  the  image,  and  the  Z-axis  is  oriented  along  the  instantaneous  line  of  sight. 
The  Z-axis  intercepts  the  viewed  object  at  (0,0.  Z„).  It  is  assumed  that  around 
this  point  the  object  can  be  described  by  a  tw  ice-differentiable  surface  (not 
necessarily  planar)  Z(X,Y)  Image  coordinates  will  be  denoted  by  lower-case 
letters  (x,  y)  where  I  =  j  ,y  =  j.V.V,  IT  will  denote  velocities  in  space  in  the 
X,Y,Z  directions,  and  u.  v  image  velocities  in  the  x,  y  directions.  Subscripts 
such  as  ux,uy  will  denote  partial  derivatives  along  the  x  and  y  direction. 

2.2  The  Instantaneous  Velocity  Field  Around  the  Oriein 

The  instantaneous  motion  of  a  rigid  object  can  bp  described  by  six  inde¬ 
pendent  parameters.  We  will  denote  them  by  M, , t  =  1,...6.  A/i.Afj.M?  are 
the  velocities  along  the  X,Y,Z  directions  scaled  by  the  distance  Zo,Mt  = 
U /Zo,M2~—  V/Z0,M3  ■ ~  W /Zq.  M4,  Mi,  Ma  are  the  angular  velocities 
around  the  X,Y,Z  axes.  The  shape  parameters  we  will  use  are  T|,...,Ts, 
the  surface  orientation  at  the  origin  T t  -  (fj  )0.  T2  —  )0.  and  the  sur' 

face  curvature  at  the  origin  (scaled  byZo)  7s  =  7«  =  £o(jy§)0> 

Ti  =  Zo{dj§i)0-  71,  are  assumed  below  to  be  finite. 

The  image  observables  that  will  be  used  to  recover  the  unknown  3- 
D  shape  and  motion  parameters  are  the  image  velocity  at  the  origin  (0,0) 
and  its  first  and  second  derivatives  in  the  x  and  y  directions.  Rather  than 
using  these  derivatives  themselves,  we  will  use  linear  combinations  of  them, 
as  suggested  originally  by  Koenderink  L  van  Doom  (1975)  and  by  Longuet- 
Higgins  4’  Prazdny  (1980). 
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The  first  two  observables  evaluated  at  (0,0)  are  the  image  velocity  at  the 
origin  in  the  z  and  y  directions.  The  next  four  can  be  thought  of  as  describing 
the  deformation  of  a  differentia]  neighborhood  around  the  origin  O3  and  04 
are  the  rate-of-stretch  along  the  x  and  y  axes,  O 5  measures  the  rate  of  decrease 
in  the  angle  between  line  elements  oriented  along  the  axes,  and  O&  is  the  local 
rate  of  rotation.  O7  -  0 13  are  the  spatial  derivatives  of  these  variables.  By 
evaluating  explicitly  the  various  derivatives  it  is  straightforward  to  obtain  the 
mathematical  relations  between  the  observables  and  the  unknown 

parameters  Mi,...,  Mo  Ti,...Ts  (Waxman  it  Ullman  1983.  See  also  a  similar 
derivation  in  Longuet-Higgins  it  Prazdny  1980).  The  resulting  equations  are: 


(1) 


O,  =  -Ml  -  Mi 

Or  = 

O  2  =  —  Mi  4-  M4 

o*  = 

O3  =  M3  +  M\T\ 

O0  = 

04  =  M3  -t-  A/jTj 

0 10 

O5  =  |  (A/,7,  +  A/,7,) 

On  = 

Oo  =  —Mo  +  \(M2T\  —  A/,7,) 

0 12  = 

—2(Ms  ■+  M3T1)  +  M1T3 
M4  -  MyTt  +  M,T5 
-A/s  -  M3T1  +  A/,TS 
:  2(A/«  -  MiT,)  +  M2T, 

'■  j  ( —  A/«  +  M3T3  +  A/jTs  —  A/jTs) 
:  \(-Ms  -  M3T1  ~  MiTi  +  MjTi) 


These  relations  can  be  used  for  the  recovery  of  the  3-D  shape  and  motion 
parameters  by  solving  for  the  M,  and  T,  when  the  O,  are  given  (measured 
in  the  image).  Longuet-Higgins  it  Prazdny  (1980)  have  shown  that  for  non- 
planar  patches  a  similar  set  of  equations  has  at  most  three  different  solutions. 
It  was  found  in  computer  simulations  (Waxman  it  Ullman  1983)  that  the 
solution  is  usually  unique,  but  an  analytic  uniqueness  proof  is  still  lacking. 
We  next  turn  to  the  planar  case  and  show  that  eq.  (1)  then  have  in  general 
exactly  two  distinct  solutions 


3.  The  two-fold  ambiguity  of  planar  surfaces 


In  the  planar  case  the  surface  curvature  parameters  Ts,7«,Ts  in  (1) 
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all  vanish  The  resulting  equations  are  still  coupled  and  non-linear,  and  the 
liUlnlx  r  of  (I I — i  mi  i  xiiulinii-  i-  n»i  ninneilialeis  apparein  In  I  In-  ee(  I  ion  it 
i*  show  n  that  there  are  m  general  two  ih-lin'  i  -olntion-.  In  addition  to  1 1n- 
surface  that  gave  rise  lo  the  velocity  lield  there  is  in  general  one  (and  only 
one)  additional  surface,  engaged  in  a  dillcrenl  motion,  that  can  produce  an 
identical  velocity  held 


3.1  There  are  in  general  at  least  two  distinct  solutions 


When  T3  =  7*  =  T$  =  0  the  last  four  equations  in  (1)  are  immediately 
derivable  from  the  preceding  eight  and  can  be  ignored.  If  the  resulting  system 
of  eight  equations  has  a  solution,  it  also  has  a  second  solution,  that  is  in  general 
different  from  the  first.  This  claim  is  established  by  giving  explicitly  a  second 
solution  in  terms  of  the  first.  Let  (Mi ,  ...,Mq,  Ti,T3)  be  a  solution  to  (1)  (with 
Tj  =  T4  ~  7s  =  0)  A  second  solution  (Mi,  ...M6,7\,T2)  can  be  derived 
explicitly  as  follows: 

Ti  =  -Mi/M, 

Tj  =  -M,/M, 

Mi  =  -TxM3 

Mi  =  ~T7M3  (2) 

M3  —  M3 

M  4  —  M4  —  Mi  —  M3Ti 
Ms  =  Ms  +  Mi  +  M3Tx 
Mg  =  Ms  +  MxTi  —  M3TX 

This  solution  exists  provided  that  M3  ^  0.  Ths  case  M3  =  0  is  examined 
in  section  4  below.  The  two  solutions  are  dual  in  the  sense  that  either  one 
can  be  used  in  (2)  to  obtain  the  other.  In  the  two  solutions  the  directions  of 
the  translation  vector  and  the  surface  normal  are  interchanged.  If  t,  n  are  the 
translation  vector  and  surface  normal  in  the  first  solution,  then  the  second  v 
points  in  the  direction  of  n,  and  n  in  the  direction  of  v. 

We  next  turn  to  show  that  there  are  no  more  than  two  distinct  solutions. 
This  will  be  done  in  two  stages.  Section  3.2,  which  is  the  main  step  in  the 
proof,  shows  that  if  two  planar  surfaces  have  identical  velocity  fields  then 
they  have  the  same  value  of  M3  (velocity  along  the  line  of  sight)  Section  3.3 
establishes  that  there  are  at  most  two  solutions  that  share  the  same  value  of 
M3 

3.2  Two  planes  that  induce  the  same  velocity  field  have  the  same  value  of  Ms 
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Lit  it  hikI  r.  In- 1  wh  plan<-<  engaged  m  nmt  inn- (  A/|  ..  A/,  )  ami  ( Af  | . Me.) 

ri»|iri  l  iv<  !y.  '  !..it  iodine  nil  ill  ual  kI.hiiv  liilil-  (ii  identical  oloerwtlile* 
hi  (IJj  VV  i-  i  an  a-'Hini  1 1 1  :■!  i  In  htu  plane-  inter-eft.  Otherwise.  T\ 

T\.Tj  'J'i.  winch  implies  either  (i)(A/| . A  I,.)  (Mt . A/.-.),  or  (ii)  all 

of  tin  traii'hitioti  ciimpoiient>  A/, . Ah .  A/...  M , .  A/-.  A/:(,  vam«h.  This  latter 
case,  corresponding  io  pure  rotation,  is  uninteresting  since  no  3-D  informa¬ 
tion  is  conveyed  In  the  changing  image.  Lot  l  lie  the  intersection  of  ir  and  if. 
This  special  line  in  space  participates  in  two  different  motions  and  induces  the 
same  (linear)  velocity  fields.  Without  loss  of  generality  we  can  re-orient  the 
coordinate  system  so  that  the  projection  of  f  coincides  with  the  t  axis  The 
line  (  can  be  described  now  by  Z  =  ax  -t  Z(l.  The  main  step  in  the  proof  is 
to  consider  the  intersection  line  £  instead  of  the  two  planes  This  line  has  the 
property  that  when  it  participates  in  motion  (Mt,  ...,Mc)  or  (Mi, Mo),  it 
induces  identical  velocity  fields. 

Consider  the  situation  in  which  the  planes  no  longer  exist,  only  the  single 
line  £  is  moving  in  space.  Let  it  move  with  the  3-D  motion  parameter  (Mi  - 
Mi,..., Me  -  Me)  From  the  original  coincidence  between  the  velocity  fields 
of  x  and  x,  and  since  l  lies  on  both  planes,  it  follows  that  the  velocity  field 
projected  by  i  now  vanishes,  i.e.,  it  satisfies  at  the  origin  the  equations: 

ti=0  u=0 

=  0  =  0  (3) 

f:  =  o 

We  have  transformed  the  problem  of  the  moving  planes  into  a  problem 
concerning  a  moving  straight  line.  The  question  is:  Under  what  conditions 
the  velocity  field  of  a  moving  line,  as  expressed  in  eq  (3),  vanishes?  In  the 
re-oriented  coordinate  system  let  us  denote 

Mi  -  Mi  =  Vs  M,  -M,  =  Vv  Ms  -M,  =  V. 

Mi  -  Mi  —  «Jj  M<,  -  Mi  =  Wy  Me  ~  =  vit 

The  five  equations  in  (3)  can  be  expressed  in  terms  of  the  six  motion  pa¬ 
rameters  (V'*,Vt,V,i,wv,ttI).  The  derivation  is  somewhat  lengthy,  but  straight¬ 
forward.  The  resulting  equations  are. 

Vx  ZoU'»  —  0 

V,  -  ZnW,  =  0  (4) 

V,  -  aVx  =  0 
aVM  +  ZgUy  =  0 
aVy  -  Zn«£,  =  0 


49 


Richards  &  Ullman 


Inferences  from  Images  6.2 


From  which  it  follows  that: 

1,  0  l  ,  ft  ti„  0  \  „  it  ,  n  ii,  it. 

In  term®  of  the  origin.-ii  planes  r.  t; .  the  implication  of  1  .  —  0  in  that 
M:<  —  Mx.  In  the  partiritlar  coordinates  system  in  which  f  projects  onto  the 
x-axis.  it  is  also  true  that  A/|  M ,  and  A/*  -  A/?, 


3.3  The  number  of  distinct  solutions  cannot  exceed  two 

Since  ail  the  possible  planar  solutions  share  the  same  value  of  Af.x,  the 
proof  will  be  completed  by  showing  that  for  a  given  M3  there  are  at  most 
two  distinct  solutions.  We  proceed  along  the  following  plan.  If  *  is  a  planar 
solution  let  i  be  now  the  intersection  line  of  ir  with  the  frontal  plane  Z  =  Z0. 
We  will  call  t  the  “tilt  line”  of  ».  If  we  re-orient  the  coordinate  system  so  that  t 
runs  along  the  x-axis,  then  in  the  new  coordinate  system  Tj  =  0.  From  eq.  (1), 
in  this  reoriented  coordinate  system  O3  =  M3.  It  can  be  observed  from  eq.(l) 
that,  if  we  exclude  solutions  in  which  Tj  =  0  and  Mi  =  0  simultaneously, 
the  a  fixed  M3  and  Tj  =  0  determine  the  solution  uniquely.  W’e  will  show 
that  there  are  at  most  two  orientations  of  the  coordinate  system  for  which 
Os  =  M3.  This  will  imply  that  (except  for  two  special  cases  that  will  be 
examined  separately)  there  are  at  most  two  distinct  tile  lines  and  therefore  at 
most  two  distinct  solutions.  We  will  assume  here  Ms  £  0,  the  case  M3  =  0 
is  examined  in  section  4.  It  is  convenient  for  the  proof  to  assume  that  the 
velocity  field  satisfies  initially  O3  =  0«.  This  can  be  assumed  without  loss  of 
generality,  since  it  is  always  possible  to  satisfy  the  assumption  by  re-orienting 
the  coordinate  system  (choosing  new  x,  y  coordinates)  in  the  following  manner. 
Let  us  rotate  the  coordinate  system  along  the  line  of  sight  by  an  angle  0  and 
denote  the  eight  observables  in  the  new  coordiante  system  by  (0,,...,0g) 
We  will  determine  an  angle  0  such  that  following  the  rotation  O3  =  0«.  In 
the  rotated  coordinate  system: 

O3  =  Oseos70  -t  OiStn70  L  2Oi8in0cos0 
O 4  =  OiCos70  ■+  O3sin70  -  2OiStn0cos0  (5) 

to  obtain  O3  —  0«,  0  must  satisfy 

(Os  -  0^)00820  +  2Oi,8in20  =  0 

assuming  0%  O,  0  is  determined  by 

tan20  =  2*^  (6) 
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Then  will  alwai-  hr  u  «ulutinn  for  ?  (Not  iiot-estarily  unique  The  raw 
(}■  u  will  lx  cxaiiuiM-d  M-paraieK  )  V\r  ran  ax-unx  iliereforc  iliai  *i-  have 
mil  iall\  a  mi  ininibl  c  *\  "iieiii  in  w  Inrli  O;.  ( b  \N  e  now  ml  at  c  <  lie  coord  mat  e 

<\  si  cm  l>\  a  new  angle  o .  and  denote  again  I  he  observables  before  t  lie  rol  at  ton 
hr  (Oi . 0,1  and  following  the  rotation  h>  (f^t.  SKI  He  seek  an  angle  a 

for  which  O,  My  In  general  (J:.  O.rnt-n  (J4  fin’o  20.,*in  ow  o 
But  since  Oy  -  O 4 

Oy  —  O;  -  20 '.sin  a  cos  q  (7) 


assuming  0*.  *  0 

«tn2a  = 

There  are  four  solutions  for  a  in  the  range  |0, 2rr ,,  01,  o2,  Qj  t  x.oj  + 
n  The  solution  a,, a,  +  x  are  equivalent,  they  give  the  same  solution  for 

This  establishes  the  claim  for  the  general  case  Two  special  cases  that 
were  excluded  from  the  proof  can  be  analyzed  in  a  similar  manner.  In  the 
first  case  there  is  a  solution  for  which  T|  =  0  and  Mi  =  0  simultaneously.  In 
this  case  the  velocity  field  will  have  a  single  direction  along  which  Os  =  Ms 
Mi ,  M3,  Ms,  T\  are  then  determined  uniquely,  but  there  are  two  solutions  for 
Mi,  T2,  in  (1).  In  the  second  case  Os  =  0  for  every  orientation  of  the 
coordinate  system.  In  this  case  there  are  two  solutions,  in  one  7"i  =  T2  —  0 
(frontal  plane)  and  in  the  other  Mi  =  M 2  —  0  (motion  along  the  line  of 
sight).  In  all  of  these  cases  the  velocity  field  admits  at  most  two  planar 
interpretations. 

4.  Unique  Solutions 

In  the  previous  section  ii  was  shown  that  the  velocity  field  of  a  moving 
plane  is  compatible,  in  general,  with  exactly  one  additional  moving  plane. 

There  are  two  special  cases  under  which  the  two-fold  ambiguity  disap¬ 
pears,  and  a  degenerate  case  under  which  the  surface  orientation  cannot  be 
recovered  The  discussion  of  these  cases  will  be  brief,  since  one  of  these  cases 
is  discussed  in  Ullman  k  Waxman  (1983)  and  the  other  in  Longuel-Higgms 
(1984). 

The  degenerate  case  arises  when  Mi  =  Mj  =  M*  =  0.  The  motion  in 
this  case  is  pure  rotation.  The  motion  parameters  can  be  recovered,  but  the 
surface  orientation  remains  ambiguous. 

One  unambiguous  case  arises  when  there  is  no  relative  velocity  along  the 
observer’s  line  of  sight,  i.e.  Mj  =  0  (but  Mi  and  Mj  are  not  both  tero)  In  this 
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ras*-  t  lie  mm  i.itt  equal  n>ir-  ran  be  »ulv  ed  explu  it  Iv  and  I  lie  «n|ijt  ion  is  unique 
1  lie  -e<  olid  imai  iilug  u<  hi-  i  a-i-  .iri»<--  w  l,t  n  A/  u  ami  I  lie  oli-erver  liiove« 
ilirnil)  low.iril-  or  avv.n  from  tin-  -uriaii-  If  (In  original  inolion  satisfice 
A / ]  A/;,  7’|  .  A / j  A/;..  7  j  I  In  n  ii  ran  In- vei  n  from  equal  ion  (2)  dial  1  lie 

MTouii  -olntion  (.mu  uK".  «illi  da  lii-t  Tin  I ; 1 1 1 1 •  r  ioikIiIioii  discussed  in 
Longuet-IItggins.  (198-1)  lie  also  suggested  an  additional  condition  that  can 
be  used  In  resolve  die  ambiguity.  l>\  using  an  extended  region  of  the  plane 
rather  lhan  die  local  velocity  field. 

In  the  previous  section  we  have  left  out  the  case  Os  ~  0.  Inspection 
of  the  equations  reveal  that  this  case  falls  into  one  of  the  categories  already 
discussed.  One  case  where  Os  =  0  is  obtained  when  M\  —  M2  =  M3  =  0, 
the  ambiguous  case  discussed  above.  A  second  case  is  when  Mt  —  M2  =  0 
(but  Mi  ^  0)  and  T\  =  Tj  =  0.  In  this  case  the  motion  is  along  the  surface 
normal,  and  the  solution  is  unique.  In  all  other  cases  there  are  two  distinct 
solutions. 

4.1  Summary 

The  ambiguous  and  non-ambiguous  solution  can  be  summarized  as  fol¬ 
lows. 

1.  In  the  pure  rotation  case  the  motion  can  be  recovered  but  the  surface 
orientation  remains  ambiguous. 

2.  If  there  is  no  relative  velocity  along  the  line  of  sight  (Ms  =  0  but  MtM2  # 
0),  or  if  the  motion  satisfied  Mi/ M3  —  -Tj,  M2/M2  ~  ~T2  (motion 
perpendicular  to  the  plane),  then  the  solution  is  unique. 

3.  In  all  other  cases  the  local  velocity  field  has  a  two-fold  ambiguity.  The 
second  solution  can  be  derived  in  terms  of  the  first  by  equation  (2). 

S.  The  exclusion  of  non-planar  solutions 

We  have  seen  in  the  previous  section  that  the  velocity  field  of  a  moving 
plane  has  in  general  one  additional  planar  interpretation.  That  is,  if  the 
moving  surface  is  known  to  be  a  plane,  there  are  in  general  two  distinct 
solutions  The  possibility  remains,  however,  that  when  nothing  is  known  in 
advance  about  the  surface,  there  are  additional  non-planar  interpretations. 
In  this  section  the  question  of  non-planar  ambiguities  is  considered.  In  other 
words,  the  question  is  whether  in  addition  to  the  two  planar  interpretations, 
non-planar  solutions  are  also  possible. 

Let  r  be  t  moving  plane  and  let  p  denote  the  vector  (Mu...,Mq)  of  its 
motion  parameters.  Suppose  that  the  twice-differentiable  surface  P  is  non- 
planar  around  the  origin  and  that  it  induces  the  same  velocity  field  as  ir,  and 
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let  ;i  denote  the  iiiuiimi  parameters  of  P  Jl  is  shown  that  if  the  velocity 
field'  of  P  and  s  naniuh  in  a  neighborhood  arouini  tin  origin,  then  P  is 
in  fact  planar  at  tin  origin  That  is,  if  7T .  T  i .  7'-.  an  the  surface  curvature 
parameters  of  /'  at  the  origin  then  ]'■•.  -  T.t  *  7’r.  =  U. 

\\  jlliout  lus'  ol  generality  we  can  assume  that  both  surfaces  pass  through 
the  point  (0.0.  Z. .}  We  ran  then  distinguish  between  two  cases 

Case  J :  rr  it.  tangent  to  P  at  (0,0.  Z,,)  In  this  case  it  and  P  have  the  same 
values  for  7V  7V  From  the  original  equation  (1)  it  can  be  verified  that  the 
only  ambiguous  configuration  in  this  case  is  when  and  M 2  in  (l)  both 
vanish  That  is.  the  motion  is  directed  along  the  line  of  sight.  In  all  other 
cases  P  must  coincide  at  the  origin  with  ir,  and  satisfy  7'3  =  7'4  =  7'$  =  0 
Case  2:  tr  intersects  P  along  some  space-curve  c.  Let  7  be  the  projection  of  e 
on  the  image  plane  Assume  first  that  near  the  origin  7  is  not  a  straight  line 
segments.  Longuet-Higgins  (1984)  has  shown  that  given  the  image  velocities 
of  four  coplanar  points  (no  three  of  which  are  colinear  in  the  image),  the  3-D 
motion  parameter  can  be  recovered  up  to  the  two-fold  ambiguity  discussed  in 
the  previous  section. 

The  implication  is  that  the  motion  p  coincides  either  with  p  or  with  ji, 
the  motion  of  the  dual  solution  to  In  either  case  we  obtain  that  planar  and 
non-planar  surfaces  with  identical  motion  parameters  produce  an  identical 
velocity  field  From  the  original  equation  (1)  it  can  be  verified  that  the  only 
ambiguous  configuration  in  this  case  arises  again  when  A/j  =  A/j  =  0  If  A/3 
also  vanishes,  (the  pure  rotation  case),  the  situation  is  inherently  ambiguous, 
as  discussed  in  the  previous  section.  If  A/3  jt  0  then  *  is  in  fact  tangent  to  P, 
as  in  the  previous  case. 

The  only  remaining  possibility  is  that  near  the  origin  7  is  a  straight  line 
segment.  In  this  case  we  can  use  the  results  of  section  3.2.  Without  loss  of 
generality  7  can  be  assumed  to  lie  along  the  x  axis.  Let  A/,,7’,  denote  the 
motion  and  shape  parameters  of  j,Mx,T,  of  P.  From  3.2,  it  follows  that 
A/3  =  A/3,A/]  =  Mi,  Mi  —  Mi .  In  addition  7)  =  Ti  and  T3  =  T$  since 
both  are  measured  along  c  which  is  straight  line  in  space  common  to  *  and 
P,  and  7N  =  0  since  *  is  planar.  T4,T 5  can  now  be  analyzed  using  eq  (1). 
The  equality  of  the  observable  O0  in  (1)  implies  MiTi  =  A/jTs,  and  since 
7s  =  0,  MiTi  =  0.  Similarly  and  Ou  together  imply  M  tT  4  =  0,  and 
also  M\T4  —  0  since  M\  =  M\  If  M\  ^  0,  T4  =  0.  From  Oto  together 
with  Og  it  now  follows  that  MiTi  =  M]T$  and  therefore  Ts  =  0.  The  case 
Mi  =  0  can  be  analyzed  in  a  similar  manner. 

The  final  conclusion  is  that  Ti  =  T4=Ti=0  except  for  the  case 
M]  =  A/j  =  0  (motion  along  the  line  of  sight) 


53 


Richards  tc  Ullman 


Inferences  from  Images  6.3 


In  roni  Iii»iiiii .  wr  (hii  iii-1  iii^iii'li  lii-mi-i  ii  two  <  If  tin'  relative  mo 

1 1< >n  happen'  in  lie  al<ni“  I  In  Inn  uf  -itln  .  i  In  i'  i  In-  luc  al  velocity  Held  uf  a 
plain  r,  i»  aUu  i  mnpal  il.lt-  with  any  inm-plaiiar  -nrfaic  with  I  lie  'amo  rela¬ 
tive  motion  parameter;-.  and  w Imse  tangent  plane  ruirinde-  with  it.  Unlike 
tin  planar  two-fold  ambiguity .  tlii'  aiiiliii-'int  y  t-  loral.  and  tan  lie  resolved  In 
inspecting  a  larger  region  of  tlic  plane 

In  the  more  general  ease,  in  which  the  motion  eomponent  parallel  to  the 
image  plane  does  not  vanish,  non-planar  solutions  can  be  ruled  out,  and  the 
only  remaining  ambiguity  is  the  twofold  planar  ambiguity. 


6.  Algorithm 

The  proof  in  section  3  although  not  entirely  constructive,  leads  to  a  possi¬ 
ble  algorithm  for  computing  the  planar  solutions.  The  method  will  be  outlined 
briefly  A  different  algorithm  has  been  suggested  by  Longuet-Higgins  (1984). 

We  begin  by  rotating  the  coordinate  system  by  an  angle  8  to  obtain 
O a  —  04  (where  O,  are  the  observables  in  the  rotated  coordinated  system). 
From  (6)  we  obtain 

tanlO  = 

(Provided  that  0%  ^  0).  We  can  therefore  obtain  a  solution  (non  unique) 
for  ain6,cos6.  It  is  a  straightforward  computation  to  then  compute  the  ob¬ 
servables  (Ot,...,Os)  (Waxman  it  Ullman,  1983). 

The  planar  motion  equations  (first  eight  equations  in  (1))  can  be  viewed 
as  eight  linear  equations  in  12  unknown:  Mj,...,Mc,  and  Ai,...,As.  when 
Xi  =  MjT, ,  A2  =  M2T2,  X3  =  M,Tj.  X4  =  M2TU  A*  =  MjT,,  Ac,  = 
M3T2.  The  eight  equations  are  linearly  independent.  Furthermore,  they  can 
be  divided  into  four  groups  of  two  equations. 

(M3, Aj,A2)  appear  only  in  equations  (3,4). 

(MG,  As,  A«)  appear  only  in  (5,6). 

(Mi,  Ms,  Xi)  appear  only  in  (1,7). 

(M2,M«,Ac)  in  (2,8). 

As  a  result,  each  set  of  unknowns  can  be  solved  up  to  a  single  scalar: 

(Ms,  Xu  X2)  =  (ms,x,,xj)  -l-  a(ms,x,,x2) 

(M6,  As,  A«)  =  (m0,X3,*4)  +  0(mo,x3  ,x4)  (8) 
(M,,Ms,A5)  =  (m,,m5,x5)  +  p(mi,ms,X5) 

(Mi,  A6)  =  (m2,m4,x6)  +  <5(m3,m4,2 0) 
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l  li»-  m  .in  (i  I.  arc  <  l«-t  «-r  1 1  ii  im-cI  In  *nlvmj>  *et  •»  of  I  wn  eqii<t- 

I  mu*  in  i  Itri-c  ii n k in n*  Tin-  *<  hl<ir-  .1 .  i  r1111.nn  In  In-  til  l  criiiiiii  fl  Since 
(>:■  (>4.  ii  follow*  frinii  (I)  ili.ii  .i/j'/'j  AA7...  11  .Vj  ,V».  ilu-rc-fore 

7)  *  nr  1  =  ii  ■  a  ii  and  hence  a  1*.  determined.  Two  possible  values  for 

6, 'j.t  are  obi  allied  from  the  equation*. 

A* ,  A*  2  r  A' A*  4 

M?. X,  -  M,X-  (9) 

A/a  A' j  =  M3Xb 

respectively  Finally  we  choose  a  value  for  (a, $,1.6)  to  satisfy  the  fourth 
independent  relation  M3X\  -  M\X4  There  will  be  at  least  one  such  set 
of  values  for  (a,  6,1,6),  and  at  most  two.  When  one  solution  is  known,  the 
second  can  be  found  immediately  using  equations  (2). 

7.  The  orthographic  velocity  field 

Previous  sections  have  established  that  the  parameters  of  a  moving  plane 
can  be  determined  up  to  the  twofold  ambiguity  from  an  arbitrarily  small 
patch  near  the  origin.  When  the  viewed  surface  patch  is  small,  perspective 
effects  become  small,  and  the  recovery  process  may  become  unreliable.  It 
therefore  becomes  of  interest  to  analyze  the  case  of  orthographic  projection 
where  perspective  effects  play  no  role.  (In  orthographic  projection  a  space 
point  X,  Y,  Z,  projects  to  an  image  point  x  -  X,  y  =  Y .) 

7.1  The  orthographic  velocity  field  of  non-planar  surfaces 

Let  5 1  be  a  non-planar  surface  moving  in  space.  For  the  orthographic  case 
it  can  be  assumed  that  Sj  is  fixed  at  the  origin  (0,0,  Zq)  since  the  translation 
components  are  immediately  recoverable.  The  rotation  of  5j  can  always  be 
decomposed  into  the  sum  of  two  components:  a  rotation  with  angular  velocity 
w  (assumed  to  be  non-zero)  about  an  axis  lying  somewhere  in  the  frontal 
plane  (x*rotation)  and  a  component  (z-rotation)  about  the  line  of  sight  Z 
with  angular  velocity  9. 

A  second  surface  S3  is  said  to  be  a  depth  tcaling  of  S\  if: 

1  For  every  point  (z,y,  *)  on  St ,  (x,y,kt)  is  a  point  on  Sj  for  some  constant 

k  (k  #  0). 

2.  The  rotations  w j  and  w3  are  around  the  same  axis,  and  uij  =  wj/k. 

3.  0j  =  02 

For  non-planar  surfaces  the  following  proposition  can  be  established.  If 
Sj  is  a  possible  rigid  interpretation  of  a  given  orthographic  velocity  field,  then 
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Vj  is  another  possible  interpretation  if  and  only  if  it  is  ofnatnerl  from  V,  via 
dept h  sc  abiip 

Note  that  if  ft,  II..  0  then  >,  ami  have  <  1 1 1  f  t  rent  iii'l  ant  acieoci' 
axes  of  rotation  in  spare  The  urt hograplnr  velcxm  field  therefore  does  riot 
determine  uniquely  the  rotation  axis  in  spate 

Since  our  concern  here  is  primarily  with  planar  surfaces,  the  proof  uill 
be  omitted,  it  can  be  found  in  (1‘llrnan  I9H3) 

7.2  The  orthoeraphic  velocity  field  of  planar  surfaces 

In  the  planar  case  the  twofold  ambiguity  of  planar  surfaces  is  combined 
with  the  inherent  depth-scaling  ambiguity  of  orthographic  projection.  As  a 
result  the  orthographic  projection  of  a  planar  surface  admits  two  interpreta¬ 
tions,  each  defined  up  to  depth  scaling. 

For  simplicity  of  the  analysis  we  can  assume  that  in  the  planar  velocity 
field  all  the  velocity  vectors  are  parallel  to  the  x  axis.  (If  the  inducing  object’s 
Z-rotation  is  9,  then  by  rotating  the  observed  velocity  field  by  -6  all  the 
velocity  vectors  will  become  parallel.  Their  direction  can  be  taken  as  the 
x-axis.) 

The  velocity  field  u(x,y)  v(x,y)  now  has  the  form: 

u(x,y)  =  ax  +  ffy 

»(*,»)  =  0  (10) 

If  (uiz,  is  the  angular  velocity  vector  of  the  rotating  surface  (as¬ 

sumed  to  be  non-zero)  then. 

u?vz  -  wMy  =  ax  +  0y 

w,z  -  wx  z  =  0  (11) 

One  solution  to  these  equations  arises  when  wx  =  0.  This  implies  vjx  =  0 
(if  z  is  not  identically  zero),  and  z  =  ^(ctr+  f)y).  This  solution  corresponds 
to  a  plane  rotating  about  the  vertical  axis. 

If  w,  yf  0  then  wx  ^  0  also  and  z  =  x.  This  solution  is  also  a  plane, 
with  a  tilt  line  along  the  x  axis. 

These  two  possible  interpretations  cannot  be  resolved  on  the  basis  of  the 
instantaneous  velocity  field.  How  much  additional  information  is  required 
to  guarantee  a  unique  solution?  For  non-planar  objects,  it  can  be  shown 
(Ullman  1983)  that  one  additional  view  is  sufficient  to  remove  the  depth¬ 
scaling  ambiguity  For  planar  objects,  the  problem  is  open. 

The  orthographic  velocity  field  is  thus  inherently  more  ambiguous  than 
the  perspective  one  Instead  of  two  solutions  there  are  two  families  of  solu¬ 
tions,  each  determined  up  to  depth  scaling 
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The  addit mnal  ambiguity  of  t lie  orthographic  velocity  field  implies  that 
under  local  aual\«i*  (i.t-  H-mig  a  Miiall  Mirfai e  patch)  the  3-D  nriiwv  proct—*- 
i>  nut  (‘lit  irtl\  ~i  ;i !  ■!•  •  \-|.itI~  of  i  In  3-D  structure  that  arc  invariant  under 
depth  scaling  are  expected  to  lie  more  stable  than  others.  For  planar  sur¬ 
faces  tlicM  nr. .mailt -  include  the  orientation  of  tin  lilt  line  and  the  rotation 
component  around  the  line  of  -iglit  J’arameters  t hat  are  noi  invariant  under 
depth  scaling  such  a-  surface  slant  are  expected  to  he  less  robust. 

8.  Summary 

1.  The  velocity  field  of  a  planar  surface  exhibits  in  general  a  two-fold  am¬ 
biguity.  In  addition  to  the  moving  plane  that  has  actually  induced  the 
viewed  transformation  there  is  one  additional,  and  in  general  entirely 
different,  planar  solution. 

2.  There  are  special  cases  in  which  the  interpretation  of  the  local  velocity- 
field  becomes  unique.  These  cases  are  (i)  Afj  =  0  (but  M\Mt  ?  0), 
and  (ii)  motion  directed  towards  or  away  from  the  surface.  A  degenerate 
case  arises  for  pure  rotation.  In  this  case  the  motion  parameters  can  be 
determined,  but  the  3-D  structure  remains  undetermined. 

3.  Additional  non-planar  solutions  are  in  general  excluded.  The  exception 
is  the  case  of  motion  directed  parallel  to  the  surface  normal. 

4.  The  two  planar  solutions  can  be  computed  from  the  eight  kinematic  ob¬ 
servables,  using  the  algorithm  in  section  6. 

5.  If  one  of  the  two  planar  solutions  is  known,  the  dual  solution  can  be 
expressed  in  terms  rf  the  first  using  eq.  2. 

6.  In  the  orthographic  case  there  are,  instead  of  two  solutions,  two  families 
of  solutions,  each  determined  up  to  depth-scaling.  It  is  expected  that 
for  the  perspective  case  only  3-D  parameters  that  are  invariant  under 
depth-scaling  would  be  robust  under  local  analysis. 

The  results  explored  in  this  paper  are  theoretical  in  nature.  They  set 
some  limits  on  the  performance  of  any  motion  perceiving  device.  It  is  un¬ 
known,  however,  to  what  degree  the  human  visual  system  can  approach  these 
theoretical  limits  It  may  be  of  interest,  therefore,  to  test  psychophysically 
some  of  the  implications  of  the  above  analysis.  For  example: 

— Can  subjects  interpret  the  velocity  field  of  planar  surfaces?  (i.e.,  make 
some  reliable  judgements  of  relative  motion  parameters  and  surface  orienta- 


57 


Richards  &  Ullman 


Inferences  from  Images  8.2 


lion).  Rc«uli!»  in  (fiil'«in  </.  nl  10">9)  indicate  t it<i  1  under  <orne  condition 
tlii'  i'  |m i"il>lc  till  lions'll  t  Ik  accuracy  i'  probably  not  very  Ingli 

Can  they  ml  i  rprct .  al  hast  to  «niiie  degree,  the  planar  velocity  lields 
under  brief  present  a  I  ion'.’  Jf  l  hex  do.  how  do  they  handle  the  inherent  twofold 
ambiguity  ? 

--('an  observers  interpret  orthographic  velocity  fields?  Can  they  recover, 
for  example,  the  tilt  of  one  or  botli  planar  solutions? 

Answers  to  these  questions  ntay  give  us  a  better  insight  into  the  process¬ 
ing  employed  by  the  human  visual  system  in  the  recovery  of  structure  from 
motion. 
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6.3  Rigidity  and  Smoothness  of  Motion  (S.  Ullman  and 
A.  Yuille) 

The  smoothness  assumption  in  measuring  visual  motion 

The  problem  of  measuring  visual  motion  is,  in  many  situations,  undercon¬ 
strained.  That  is,  the  information  in  the  changing  image  is  insufficent  to 
determine  the  motion  uniquely.  This  indeterminateness  is  often  referred  to  as 
the  “aperture  problem”  (for  example  Marr  and  Ullman,  1982).  An  additional 
constraint  is  therefore  required  for  resolving  the  ambiguity  and  determining 
the  velocity  field  uniquely.  An  important  constraint  that  has  been  proposed 
for  solving  the  problem  is  a  smoothness  constraint.  (Fennema  and  Thomp¬ 
son  1979,  Horn  and  Schunck  1981,  Hildreth  1984,  Nakayama  and  Silverman 
1986).  When  applied  to  the  problem  of  measuring  the  velocity  of  image  con¬ 
tours,  this  smoothness  assumption  has  been  formulated  by  Hildreth  in  the 
following  manner.  Of  all  the  velocity  fields  that  are  consistent  with  the  trans¬ 
forming  image,  select  the  velocity  field  that  minimizes  the  overall  variation. 
That  is,  if  v  denotes  the  velocity  at  a  point,  and  «  the  arc  length  along  the 
curve,  the  preferred  velocity  field  is  the  one  that  minimizes  the  integral: 

/■£■** 

along  the  curve  (the  principle  of  least  variation).  It  has  been  shown  that 
this  method  of  determining  the  velocity  field  gives  good  results  under  a  wide 
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range  of  conditions,  and  it  seems  to  correspond  in  many  cases  to  the  velocity 
field  perceived  by  human  observers  (Hildreth  1986,  Nakayama  and  Silverman 
1986). 

Justifying  the  smoothness  assumption. 

The  main  rationale  raised  in  support  of  the  smoothness  assumption  is  that 
the  velocity  field  induced  by  smooth  contours  in  motion  is  expected  to  be 
smooth.  This  argument  is  insufficent,  however,  for  justifying  the  principle  of 
least  variation.  The  smoothness  assumption  implies  that  if  the  object  is  known 
to  be  smooth,  the  velocity  field  should  not  contain  discontinuities.  It  does  not 
imply,  however,  that  uthe  smoother  the  better”.  Consider,  for  instance,  two 
points  in  the  image  separated  by  a  small  distance,  and  moving  with  image 
velocities  Vt  and  V2.  One  may  argue  that  if  they  lie  on  the  same  surface, 
extremely  high  values  of  ||Vi  -  Vj||  are  unlikely,  because  of  some  physical 
limitations  on  the  motion  of  objects  in  space.  But  why  should  one  assume 
that  a  relative  velocity  of,  say,  0.2  deg/ sec  is  less  likely  than  0.1  deg/sec?  Such 
a  preference  is  assumed  by  the  least  variation  principle  (that  favors  velocity 
fields  in  which  the  difference  in  velocity  between  neighboring  points  is  as  small 
as  possible),  but  cannot  be  defended  only  on  the  basis  of  a  general  smoothness 
argument.  Clearly,  a  stronger  constraint  than  just  the  general  smoothness  of 
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surfaces  is  required. 

Another  general  property  that  seems  to  provide  a  useful  constraint  in  the 
analysis  of  visual  motion  is  rigidity.  Computational  studies  have  shown  that 
the  3D  stucture  of  rigid  and  quasi-rigid  objects  can  be  recovered  by  looking 
for  the  mo6t  rigid  interpretation  possible  of  the  changing  image  (Ullman  1979, 
1984). 

In  this  paper  we  argue  that  local  rigidity  of  the  object  and  the  principle  of 
least  variation  in  the  velocity  field  are  related.  To  investigate  this  we  consider 
a  rod  moving  in  three-dimensional  space.  The  rod  is  allowed  to  move  in  a  semi¬ 
rigid  manner:  it  can  rotate  and  translate  freely  and  is  also  allowed  a  certain 
expansion.  We  calculate  the  relative  velocity  between  the  two  endpoints  of 
the  rod,  as  projected  on  the  image  plane.  The  calculations  show  that  under 
a  wide  range  of  conditions  this  velocity  distribution  is  peaked  at  zero  velocity 
and  decreases  monotonically  at  higher  velocities.  As  a  result,  the  projected 
velocity  field  of  points  linked  rigidly  together  is  likely  to  be  consistent  with 
the  principle  of  least  variation  in  the  velocity  field.  There  are  two  factors 
contributing  to  this  strong  bias  towards  small  differential  velocities  in  the 
projected  velocity  field:  (i)  the  rigidity  of  the  link  between  the  two  points  and 
(ii)  the  effect  of  projection  from  3-D  to  the  2-D  imr  •  -  plane. 

If  we  imagine  an  object  being  made  up  of  a  collection  of  semi-rigid  rods 
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this  analysis  suggests  that  the  projected  velocity  field  is  likely  to  be  smooth. 
We  conclude  that  the  principle  of  least  variation  is  a  reasonable  constraint 
that  can  be  justified  for  the  projected  velocity  field  of  rigid  or  locally  rigid 
objects. 

Section  1.  The  two-dimensional  case. 

First  we  consider  a  rigid  rod  moving  in  two  dimensions  and  being  pro¬ 
jected  onto  a  line.  Because  it  is  rigid  the  rod’s  velocity  can  be  split  into 
rotational  and  translational  components.  For  our  purposes  the  translational 
component  can  be  ignored  as  it  will  not  contribute  to  the  derivative  of  the 
velocity  field  on  the  line. 

Let  the  projected  length  of  the  rod  be  R  and  the  real  length  be  r,  the 
angle  to  the  vertical  be  9  and  the  angular  velocity  be  u.  Then  the  projected 
velocity  distribution  #(u)  is 

«(ti)  =  J  6(R  —  rsin9)6(u  -  ru>  cos  9)9r(r)^u(uf)drtL>d9  (1.1) 

where  6  denotes  the  Dirac  delta  function,  tr(r)  and  ♦*(<*>)  are  the  distribution 
of  r  and'u;  respectively.  r»in9  and  rwcosff  are  the  projected  lengths  and 
velocity  respectively.  We  assume  that  rods  are  equally  likely  to  have  any 
length  between  0  and  rm.x  and  set  #r(r)  =  k  between  0  and  rmaz  (with 
krm„  =  1). 
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We  can  integrate  with  respect  to  9  using  the  first  delta  function.  This 
gives  (see  Appendix  1) 


♦(it)  =  f  ( v)drdu>  (1.2) 

Jr=o  Vr 2  -  R? 

We  integrate  with  respect  to  r  to  obtain 


iu(u)du 
(A?u»J  +  uJ)i 


(1.3) 


where  u>mj„  =  u/VTmax~  This  lower  bound  comes  from  the  delta  func¬ 
tion  integration. 

Now  we  examine  the  behavior  of  $(u).  If  we  differentiate  it  with  respect 
to  tt  we  get 


_  _  f°°  J.  ■  _ $w(t*nin) _  (  4x 

a»  *  L.....  (JP «»+.')»  +  1 '  • 

So,  /or  any  probability  distribution  ♦„(<*/)  (which  by  definition  must  be  a 

positive  function)  the  two  terms  on  the  right  hand  side  of  (1.4)  must  be 
negative.  $(u)  must  decrease  strictly  monotonically  with  u  and  has  a  unique 
maximum  at  u  =  0. 

Thus  whatever  the  distribution  of  the  rotation  the  meet  likely  projected 
velocity  is  zero.  This  result  is  similar  to  that  obtained  by  UUman  (1979) 
for  the  motion  of  dots.  It  was  shown  that  if  the  motion  of  a  random  dot  is 
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described  by  an  isotropic  probability  distribution  function  in  three  dimensions 
then  the  probability  distribution  for  the  projected  two-dimensional  motion  is 
peaked  at  0. 

Note  that  our  result  is  independent  of  whether  we  choose  an  upper  bound 
for  r.  This  can  easily  be  seen  by  setting  rmax  *-►  oo  and  v  *-►  0  in  (1.4).  In 
fact  having  an  upper  bound  for  r  makes  di/du  more  negative  and  makes 
♦(ti)  decay  faster.  Henceforth  we  assume  for  simplicity  that  there  is  no  upper 
bound  for  r. 

To  see  the  connection  with  the  smoothness  assumption  observe  that  in 
the  limit  Hildreth’s  smoothness  measure  can  be  expressed  as 

/  l£r  *-»}>'  (IS) 

where  the  sum  can  be  taken  over  a  set  of  rigid  rotating  rods.  The  results  above 
show  that  for  each  rod  considered  independently  the  probability  distribution 
of  the  u  is  peaked  at  zero.  Thus,  with  this  independence  assumption,  we 
argue  that  the  most  likely  distribution  of  the  whole  contour  is  the  one  that 
minimizes  (1.5).  We  will  relax  the  independence  assumption  in  section  4. 

Section  2.  The  three-dimensional  case. 

We  now  extend  the  analysis  to  a  rod  moving  in  3-space.  Consider  a  rod 
of  length  r  projected  into  the  image  plane,  which  has  unit  normal  vector  k. 
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The  rod’s  direction  in  3-D  space  is  denoted  by  r  which  is  a  unit  vector  in  the 
direction  r.  The  rod  is  projected  into  a  rector  rp  where 

rp  =  f-(f-k)k.  (2.1) 

The  rod’s  rotation  is  described  by  a  vector  <3,  where  u  is  the  frequency  and 

<3  is  the  axis  of  rotation.  The  velocity  v  is  given  by 

v  =  <3  x  r.  (2.2) 

The  projected  velocity  field  vp  is  given  by 

vp  =  F-(v-k)k.  (2.3) 

We  consider  rods  with  projected  length  R  and  projected  velocity  V .  We  are 
interested  in  the  projected  velocity  distribution  *(t7).  Since  the  projected 
velocity  distribution  is  rotationally  symmetric  we  can  express  this  as  U$(U) 
(see  Appendix  1).  If  the  rods’  length  distribution  is  $r(r)  and  their  rotation 
distribution  is  $w(<2)  then  the  projected  velocity  distribution  is  given  by 

*(«)  =  J HR-  \rpmU  -  \vp\)*r(r)*u(*)d?du.  (2.4) 

Care  is  needed  in  specifying  the  domain  of  integration  of  (2.4).  From  (2.2) 
we  see  that  only  the  component  of  Q  perpendicular  to  r  contributes  to  the 
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velocity.  We  mutt  require  <2  •  F  =  0.  We  impose  this  by  substituting  &  delta 
function  into  (2.4)  and  then  integrating  over  all  <3. 

Thus  we  have 


V*{U)  =  J  6(R-  IfplWw  •  f)S{U  -  |ffp|)**(<3)dfd<3.  (2.5) 

We  assume  all  directions  of  the  rod  are  equally  likely  and  the  rods  are 
also  equally  likely  to  have  any  length.  So  we  set 


(2.6) 

We  also  assume  that  the  rotation  is  equally  likely  to  be  in  any  direction.  Then 
♦w(<3)  is  a  function  of  |w|  =  u  only. 


U*{U)  =  fs(R-  |r*  |)d(w  •  f)S(U  -  |fF|)#w(u;)drm  (2.7) 

Choose  an  orthogonal  basis  t,  J,  k.  Define  angles  9,  <p,  a,  0.  These  angles 
are  illustrated  in  figure  1.  9  is  the  angle  between  the  unit  vector  and  the  k 
axis.  i a  the  angle  between  the  unit  vector  projected  onto  the  z,y  plane  and 
the  x-axis.  Similarly  for  a  and  0.  It  follows  that 


?  =  sin9cos<pi  +  $in9  sirup]  +  co»Bk 
u  =  etna  co»0t  +  sina  sin0j  +  cosak 


(2.8) 
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Then  from  (2.8)  we  have 

cost  ss  c  os9  cos  a  +  »*n9  sina  cos{<p  -  j3).  (2.12) 

Define  the  angle  p  by 

(<Sx  D.  fc  oiia 

c  p  ~  ]zT7\  ’ k-  (2'13) 

From  (2.11)  we  obtain 

cospsinr  as  (<3  x  r  •  k). 

From  (2.8)  we  find 

cospsinr  =  sinf  sina(<p  -  f}) 

and  we  calculate 

|tT,J  s  usinr  tin  p.  (2.16) 

We  use  (2.10),  (2.11)  and  (2.16)  to  write  the  delta  function*  a* 

*(*-|f,|)d(w-?)*(lHtfF|)  =  S(R-r sin8)S(cosr)S(U —Or sinr sinp).  (2.17) 
We  now  need  to  change  the  variable*  of  integration.  We  have 
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dud?  =  dudr  sina  sin0  dad0dBdp.  (2.18) 

The  integrand  only  depends  on  ip  and  0  aa  tp  —  0.  We  can  set  V*  —  'P  —  0  and 
integrate  out  0.  Then  we  have 

cost  =  cos0  cosa  +  sin8  sinacosrp 

cospsinr  s  sin0  sina  sinrj)  •  (2-19) 

dud?  =  tduidr  sina  sind  dadrfrdB 
Now  we  must  change  the  variables  of  integration  from  a,  $  to  r,p: 

l  §7  §7\ 

dadrjj  s  det  I  ]  dr  dp.  (2.20) 

l  %L/ 

'  or  £p  ' 

From  (2.19)  we  eliminate  the  terms  containing  0  to  obtain 

cosa  =  cost  cosd  ±  sina(sin28  —  cos2  p)l.  (2.21) 

Define  functions  F(T>p,a,ip)  andG(r,  p,  a,0)  by 

F(r,p,a,V>)  =  cost  -  casOcosa  -  sinO  sinacosip 

(2.22) 

G(r,p,a,i>)  =  cospsinr  -  *in9  sina  sini> 

It  is  a  standard  result  of  Jacobian  transformations  (it  can  be  obtained  by 

using  formula  (2.20)  twice,  once  changing  variables  from  a,V>  to  F,G  and 

then  changing  F,G  to  r,p)  that 
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(OF 

w 

oa 

W 


From  (2.22)  we  obtain 


and 


(2.23) 


(2.24) 


Thus  we  calculate 


ain$  sina{»in$  eooa  -  eo»i>  cos#  since). 


(2.25) 


dadi/f  so 


oiirr  tinp 

oin8  sina(*in9  cosa  -  cob^cobB  sina) 


We  substitute  for  eos^  from  (2.19)  and  obtain 


(2.26) 


j  jjl  oiirrsinp  .  , 

~  sina{co*a  -  cobOcobt)  T  ^ 


(2.27) 


then  using  (2.21)  we  find 


.  j-s.  sinrsinp 

aacty  =s  —  - — r  drdn. 

*tna(*tn20  —  coo3p)t 


(2.28) 


Hence 


■  a ,  , ,  tinr  sinp  sinS 

Btna  atnodady  =  ...  ■■ ,  .  ■  ,drdp. 

(ain39  -  cosV)* 


(2.29) 
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We  substitute  (2.17),  (2.18)  and  (2.29)  into  (2.7) 


Vi(u)  =  J  6(R  -  rsin$)6(cosr)6( u  -  uratnr  sinp ) 

-  ,  V  sinrsinpsinQ  ,  ,  . 

#w(w) — - ~r  drdpdOdu  dr 

(stn29  -  cos2  p) * 

Now  we  do  the  integration  with  respect  to  r  to  obtain 


(2.30) 


u$(U)  =  j  *  —  sinf> sin 8 df> dt 9 ^ dr'  (2-31) 

We  integrate  with  respect  to  0  to  obtain 


Ui(u)-  f  6(<U  ~ursinP)*Uu)*™p(R/r)dpdu>dr 
J  (r2  -R2)i{R2/r2  -  cos2p)k  ‘ 


(2.32) 


We  integrate  with  respect  to  p  and  obtain 


f _ _ R*u(u)dudr 

J  r(r2  - 


(2.33) 


r(r2  -  R3)*(R2u; 3  +  u2  -  u3r2)i(u2r 2  -  u*)*' 

The  domain  of  integration  is  restricted  to  the  region  of  the  u  -  r  plane  where 
the  integrand  is  real  (for  simplicity  we  have  omitted  the  bounds  of  integration 
in  the  previous  equations).  It  is  specified  by 


r  >  R 

J2J u1  -f  u1  >  uf2r 3  (2-34) 

wr  >  u. 
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Change  the  variable  of  integration  from  r  to  X  where 


wV  *  ttJ  +  X 


(2.35) 


then 


Ru$M(u))(LjdX 

2X*(JPu;3  -  *)*(«»  +  X  -  w*  £*)*(«»  +  X) 


with  domain  of  integration 


(2.36) 


XSRW 


X>0- 


(2.37) 


t»J  +  X  >  u*R* 


Define  a  new  variable  Y  by 


Y*  =  v3  R3  -  X 


(2.38) 


then 


RjuQ„(J)<LjdY 

(u*R*  -  rJ)*(tt»  -  r»)*(«»  +w»iP  -  Y *) 


(2.39) 


with 
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Y  >0 

Y  <u)R  (2.40) 

Y  <  u. 

We  can  further  -simplify  this  expression  by  the  substitutions  Y  =  usind 
and  u  =  ( u/R)Cl .  This  yields 


•(II)  *4  /*d*  dSl ■ - MtiflAR) 


(2.41) 


It  is  clear  from  this  expression  that  *(u)  will  decrease  with  u  if  and  only 
if  $w(u;)  does  not  grow  faster  than  wJ.  Thus  for  all  realistic  cases  $(u)  will 
be  a  monotonic  decreasing  function  of  a. 


Section  3.  Expansion  and  Contraction. 

We  now  consider  the  situation  in  two  dimensions  when  the  rod  ip  allowed 
to  expand  and  contract  as  well  as  rotate.  We  can  write 

?=  (3.1) 

and  differentiating  gives 


r=rf  +  r0ft  (3.2) 

where  n  is  the  unit  vector  perpendicular  to  f.  We  define  the  expansion  coef¬ 
ficient  a  by 
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f  s  rs 


(3.3) 


and  write  (3.2)  as 


r  =  rsr  +  r8h. 


(3.4) 


The  projected  velocity  is  then  given  by 


U  =  r  a  sin8  +  rw  cosS  (3.5) 

where  8  is  the  angle  that  ?  makes  with  the  normal  to  the  image  plane  and 
8  =  u.  The  projected  velocity  distribution  is  given  by 


♦(If)  J  S{R  -  rsin8)i(U  -  (r  «  sin8  +  rueosf ))♦«(#)♦,*(<•/)  dr  du  da  dB 

(3.6) 

where,  as  in  Section  1,  we  have  assumed  that  #(r)  =  1  so  the  rods  have  no 
preferred  lengths.  We  do  the  integral  of  i{U)  with  respect  to  8  to  obtain 


♦(,)->’  -  R2)-l,78{U  -  a*-  w(r*  -  R2)t/2)*.(s)K(v)  dr  du  da.  (3.7) 
We  now  integrate  with  respect  to  r  to  find 


*(U)  =  J((V-  sR)2  +  u2R2)-ll2*t{s)*„{u)<L>ds.  (3.8) 


76 


Richards  Sc  Ullman 


Inferences  from  Images  6.S 


Care  most  be  taken  to  ensure  that  this  integral  is  evaluated  over  the  correct 
limits,  those  for  which  the  integrand  is  well  defined.  For  u>0we  need  *  <  % 
and  for  w  <  0  we  have  s  >  This  domain  of  integration  is  shown  in  figure 
(2).  We  can  split  the  integral  up  into  two  parts 


*(U)  =  *i(U)  +  *,(U)  (3-9) 


Figure  2  Tie  donuun  of  intefrmtion,  see  tert. 
where 


fMoo  rtmU/R 

*i(U)=  du  I  ((U-nR)t  +  R'u>r'''*.(s)*„(u)d3  (3.10) 

JvmO  JtM—OO 

and 
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(U)  =  r°  du  r~  ((U  -  »R )2  +  (3.11) 

J  uim—oo  JsmU/R 


We  impose  the  conditions 


#.M  -  #.(-•) 


(3.12) 


and 


#«(«)  =  *w(-w).  (3.13) 

These  ensure  that  contraction  is  as  likely  as  expansion  and  that  rotation  is 
equally  likely  to  be  any  direction. 

We  use  (3.12)  and  (3.13)  to  rewrite  (3.9)  as 


/wmoo  ftmU/R 

du  {{U-  *R)2  +  R2u2)-'f2*.(s)*Uv)ds 

rmO  J  tm-oo 

rwaoo  rom—U/R 

+  du  ((U  +  »R)2  +  R,u2)~x'2*.(s)*w{u)ds.  (3.14 «) 

J  wmO  Jtm-oo 

We  now  substitute  t  =  Rs/u  and  u;  =  u Cl/R  and  obtain 


*(U)  =  [nm°°dn  ftml  ((i-t)3  +  n2ri'2*.(ut/R)*u(un/R)dt 

"  J  O«0  Jtm-oo 

+  T57  /  me°dCl  f9~\{  1  +  tf  +  n2rl'2*.{ut/R)*vW/R)dt.  (3.14b) 

K  J  OmO  Jtm-oo 

Note  that  if  we  set  ♦,(« t/R)  =  6(ut/R)  we  recover  (1.3). 
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We  differentiate  (3.14a)  to  obtain 

ax  [u=cc  rs=U/R 

£1  =  -  /  <L>  I  ((U  +  3R)2  +  R2u2)~3f2{U  +  a*)$,(a)$w(u>)da 

OU  Jw-o  Jts- 00 

rwc  00  f  t= —U  /  R 

-  du  /  ((U-sR)2  +  R2u2)-3'2(U-3R)*.(3)*„(u)ds.  (3.15) 

J  wm  0  J  isc—  00 

We  can  simplify  this  expression  by  setting  a  to  a  —  2£/ /R  in  the  second  term 
on  the  right  hand  side.  This  yields 

w = -  OCT  w + 

(3.16) 

In  the  range  of  integration  (a  +  a£)  is  always  negative.  So  the  integral  will 
be  always  negative  provided  $,(a)  -  ♦,( a  +  2u /R)  is  negative  for  a  <  —u/R. 
If  we  make  a  positive,  using  (3.12),  this  becomes 

*,(a  -  2U/R)  -  i,(s)  >  0,  (3.17) 

which  is  true  for  all  monotonically  decreasing  functions.  Then  we  have  <  0 
everywhere  for  any  function  $w(u>).  Hence  the  projected  velocity  distribution 
will  once  more  be  peaked  about  U  =  0. 

If  we  relax  the  conditions  (3.17)  we  are  no  longer  assured  of  having  a 
monotonically  decreasing  projected  velocity  distribution  for  all  ♦w(‘*>).  If 


(u)ds. 
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$,(*)  is  linear  in  s  we  obtain  a  flat  distribution 

$(U)  =  const.  (3.18) 

If  ♦,($)  is  quadratic  in  s  we  find 

|i=-2  +  J  Ji (3-19) 

which  will  only  be  negative  for  some  distributions 

Thus  allowing  the  rod  to  expand  and  contract  reduces  the  strength  of  the 
result  of  Section  1.  But  if  we  make  the  reasonable  assumption  that  $•(*)  “ 
at  most  linear  in  s  the  result  will  still  hold. 

Section  4.  Curves  from  rods. 

The  situation  becomes  more  complex  when  we  consider  several  rods 
joined  together  to  form  a  curve.  The  results  above  show  that  for  each  rod 
individually  the  projected  velocity  distribution  is  peaked  at  zero.  If  the  rods 
are  joined,  however,  these  probability  distributions  are  no  longer  independent. 
We  now  sketch  an  argument  suggesting  that  this  loss  of  independence  does 
not  affect  the  result. 

Suppose  we  have  a  set  of  probability  distributions  Pj(xj),  <  =  1,  ..N 
which  are  all  peaked  at  zero.  We  now  define  the  joint  probability  distribution 
P(£)  where  £=  (*j,xj,.. .,**).  If  we  impose  no  constraints  this  distribution 
is  given  by 
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P(x)dx  =  njPi(xj)IIjdx,. 

Now  impose  a  consistency  constraint  *<  =  0.  We  now  have 


(4.1) 


J  P(x)dx  =  J  niPi(xi)6('£'Xi)Tlidxi.  (4.2) 

The  Xj’s  can  no  longer  vary  independently  and  the  delta  function  imposes  the 
consistency  constraint.  We  can  get  integrate  out  the  delta  function  thereby 
reducing  the  number  of  independent  variable  to  N  —  1.  It  is  convenient  to 
choose  these  to  be  x*  -  x<+i,  for  i  =  1,..., N  —  1.  We  can  now  write  (4.2)  as 

J  P(x)dx  =  J  UiPiixi  -  xi+1)IM(x<  -  xi+1).  (4.3) 

We  can  set  ~  x,  -  x,+j»  »  =  1,  ...,N  -  1  and  rewrite 


P(y)  =  Pt(yi)Pi(yt)'"PN-i(VN-i)Prf(-yi  —  ifa  —  —  —  yv-i)-  (4.4) 

It  is  straightforward  to  see  that  the  maximum  value  of  P{y)  occurs  when  the 
Vi  are  all  zero.  Thus  P  is  still  peaked  at  zero  even  with  the  constraints. 

Section  5.  An  alternative  approach. 

The  results  derived  in  the  last  few  sections  support  the  assumption  (Hil¬ 
dreth  1984)  that  the  velocity  field  along  a  contour  is  smooth.  This  assumption 
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is  used  to  solve  the  aperture  problem  by  prescribing  the  smoothest  possible 
velocity  field  consistent  with  the  data. 

An  alternative  approach  to  the  aperture  problem  was  taken  by  Waxman 
and  Wohn  (1986).  They  showed  that  if  the  object  was  locally  planar  and 
moving  rigidly  then,  assuming  perspective  projection,  the  local  velocity  field 
on  the  image  plane  would  be  locally  quadratic  in  the  x,  y  coordinates.  They 
then  defined  regions  in  which  these  expansions  were  valid  and  checked  for 
consistency  between  these  regions.  In  these  regions  they  could  do  a  least 
squares  best  fit  to  find  the  coefficients  of  these  quadratic  polonomials,  and 
hence  the  velocity  field. 

This  approach,  however,  can  also  be  used  to  justify  the  smoothest  velocity 
field  assumption.  The  orthographic  projection,  of  the  motion  of  a  rigid  body 
will  obey 


vx  =  A  +  By  +  Cz,  (5.1a) 

vv  =  D  +  Ex  +  Fz,  (5..*5) 

where  A,  B,C ,  D ,  £,  F  are  constants  and  z  is  the  (unknjwn)  depth  coordinate. 
If  we  also  assume  the  object  is  planar  then  z  obeys 
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z  =  px  +  qy  +  s,  (5.2) 

where  p,g,r  are  constants.  Substituting  (5.2)  into  (5.1)  gives  linear  expres¬ 
sions  for  vc  and  in  terms  of  x  and  y.  So  if  we  assume  local  planarity  and 
local  rigidity  (in  the  style  of  Waxman  and  Wohn  (1986))  the  velocity  fields 
will  locally  be  linear  in  x  and  y.  The  measure  of  smoothest  velocity,  /(v), 
used  by  Hildreth  is  the  integral  along  the  curve  of  the  function 


. ,  -  _  dv  dv 
J ^  “  da  ‘  da' 


(5.3) 


where  a  is  the  arc  length.  The  velocity  fields  will  be  locally  linear  if  and  only 
if  J(v)  is  zero.  Thus  we  can  think  of  the  smoothest  velocity  field  approach  as 
a  local  method  of  assuming  local  rigidity  and  local  planarity. 


Sectio  i  6.  Conclusion. 

The  arguments  in  the  first  four  sections  of  this  paper  suggest  that  locally 
rigid,  or  semi-rigid  objects,  will  tend  to  project  a  smooth  velocity  field  in 
the  image.  Moreover,  assuming  random  motions  and  limited  expansion  or 
contraction,  this  field  will  tend  to  be  a  smooth  as  possible  In  the  fifth 
section  we  noted  that  rigidity  of  an  object  and  local  smoothness  of  its  surface 
will  also  lead  to  a  smooth  image  motion. 
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Tnese  arguments  support  the  view  that  maximixing  smoothness  is  a  good 
heuristic  to  use  for  motion  correspondence  and  that  it  is  a  sensible  way  to  solve 
the  aperture  problem. 
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Appendix 

We  first  describe  the  general  method  for  integrating  delta  functions. 
Suppose  we  have  an  integral  I(a,b) 

J(a,  6)  =  ^ /(*)*(* -x0)d*  (A.1) 

where  6(z)  is  the  Dirac  delta  function,  f(x)  is  an  arbitrary  function,  zo  an 
arbitrary  point  and  b  >  a.  The  value  of  the  integral  is 

/(a, 6)  =  /(*o),  if  *o€[o,6]  {A.2a) 

/(a, 6)  =  0,  otherwise.  (A.26) 

This  result  can  be  generalized  to  integrals  of  form 
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J{a,b)  =  j  f(x)t(g(x)-c)dx  (,4.3) 

where  g(x)  is  an  arbitrary  function  and  e  an  arbitrary  number.  All  points  x{ 
with  g(xi)  =  e  will  contribute  to  this  interval.  SnppoBe  there  is  only  one  such 
point.  If  there  are  several  we  can  divide  the  integral  up  into  regions  with  only 
one  such  point.  Consider  one  such  point  *  =  0.  The  function  g{x)  can  be 
expanded  in  a  Taylor  series  about  this  point 

g(x)  =  c  +  g'(0)(x)  +  0(x2 ).  ( A.4) 

If  we  change  the  coordinate  to  u  where  u  s  ^(O)*  we  can  write  the 
integral  as 

/■s'(°)k  du 

J(a,6)  =  /  /(«/*'(  0M«  +  0(«*))-^L.  (A.5) 

o)«  ^(0) 

The  value  of  the  integral  will  depend  on  the  sign  of  ^(0).  If  it  is  negative 
the  bounds  of  the  integral  will  be  reversed  and  the  '  <^ral  will  change  sign. 
Therefore 


=  /(0)  1^0)1*  ( ,4.6a) 

J(a,b)  =  Q ,  otherwise.  ( ASb ) 
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We  now  consider  the  form  of  the  probability  distribution  function  for  a 
rotational  symmetric  vector  in  two-dimensional  space.  Suppose  the  function 
is  &a(u)du.  If  it  is  rotationally  symmetric  (and  hence  depends  only  on  the 
modulus  u>  of  u)  it  can  be  written 


$c(Q)d.L!  =  iu(u)udud<p,  (A.7) 

by  changing  to  radial  coordinates  u >,<p  in  the  u  space.  The  <p  component  can 
be  integrated  out  to  give  a  distribution  u$u(u)du>. 
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7.0  Assessment  and  Future  Directions 

Computer  Graphics  provides  a  powerful  too!  for  understanding  vision.  Although  it 
is  exceedingly  time  consuming  to  generate  realistic  models  of  a  scene  from  physical 
principles,  the  rewards  are  great.  There  is  no  other  method  available  at  present 
of  comparable  power  that  allows  us  to  isolate  the  physical  parameters  relevant 
to  “seeing”  natural  surfaces  or  objects.  Renditions  now  exist  for  many  classes  of 
plants  and  trees,  and  for  several  kinds  of  natural  phenomena  like  water,  clouds  and 
sunsets  (see  SIGGRAPH  and  Fournier  k  Reeves,  1987).  These  programs  present 
an  unusual  opportunity  for  the  visual  psychophysicist  who  wishes  to  understand 
the  perception  of  natural  surfaces,  and  how  inferences  about  material  types  and 
objects  can  be  made  from  images. 

One  way  to  further  the  Graphics  Psychophysics  approach  would  be  to  encour¬ 
age  psychophysicists  to  collaborate  with  those  laboratories  actively  engaged  in  the 
generation  of  natural  phenomena  using  graphics.  Typically  these  graphics  labs  are 
interested  in  selecting  the  best  rendition  parameters,  but  do  not  bother  to  conduct 
parametric  studies  which  the  psychophysicist  is  trained  to  do.  Graphics  laborato¬ 
ries  at  Cornell  (Architecture),  University  of  North  Carolina,  Toronto,  Cal.  Tech., 
NYIT  and  MIT-Media  might  be  approachable  hosts.  A  special  sabbatical  or  post¬ 
doc  fellowship  to  a  young  investigator  would  encourage  this  collaboration  at  little 
cost.  Psychophysical  studies  of  the  rendition  of  even  a  half-dozen  different  materials 
would  greatly  advance  image  understanding. 

Finally,  this  project  only  began  to  explore  time-dependent  phenomena,  namely 
the  motion  and  creation  of  water  waves,  plus  some  semi-rigid  motions.  This  rel¬ 
atively  unexplored  area  of  non-rigid  motions  should  be  contrasted  with  the  large 
number  of  studies  which  are  devoted  to  rigid  body  motions  or  to  flow  fields.  Non- 
rigid  motions  include  not  only  the  wave  motions  of  fluids  such  as  water,  honey, 
milk,  oil,  etc.  (which  we  can  tell  apart  psychophysically),  but  also  turbulent  mo¬ 
tions,  deformations,  and  articulated  motions.  Here  we  also  have  lacunae  well  worth 
filling. 
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Appendix  I:  Relating  Fractal  Surfaces  to  Fractal  Images 


Our  aim  is  to  recover  a  3D  fractal  measure  of  surface  roughness.  As  a  first  step,  we 
need  to  show  that  the  resultant  image  contains  a  measure  related  to  the  3D  fractal 
specification  for  the  surface.  Pentland  (1983)  and  Kube  k  Pentland  (1986)  have 
already  shown  that  a  convex  Lambertian  3D  surface  of  constant  albedo  with  a  spa¬ 
tially  isotropic  fractal  Brownian  shape  seen  under  constant  illumination  produces 
an  image  whose  intensity  surface  is  also  fractal  Brownian  with  a  fractal  dimension 
identical  to  that  of  the  components  of  the  surface  normal.  Specifically,  given 

I(z,y)  =  pE(NL)  (1) 


where  p,  E,  L  are  constant  and  IV  is  a  Brownian  fractal,  then  I(x,  y)  will  be  fractal. 

We  wish  to  prove  a  similar  result  for  glossy  surfaces.  In  particular,  we  require 
that  I  will  obey  the  rule 


Pr\ 


I{x,y)-I( z  +  Az,  y ) 

HAsf 


<  u  =  F(u) 


(2) 


where  h  is  the  Hausdorf  dimension  and 


I{z,y)  =  E(NH)Fheanel  (3) 

where  H  =  L  +  V  (i.e.  bisector  of  illuminant  and  viewer  directions).  Pfresnel  is  the 
Fresnel  reflectance  function.  Unfortunately,  Ffresnei  is  dependent  upon  the  relation 
between  N  and  L  (incident  angle),  even  when  N  and  H  align,  which  is  Beldom  for 
point  sources.  Together,  these  factors  generally  will  prohibit  a  point  source  from 
generating  specularity  regions  that  satisfy  a  fractal  image  statistic. 

However,  most  scenes  are  illuminated  also  by  diffuse,  hemispheric  sources. 
Under  these  conditions,  disregarding  occlusions,  for  every  microfacet  viewed  there 
will  be  a  patch  of  the  hemisphere  reflected  off  the  facet.  Equation  (3)  now  becomes 

I(z,y)  =  E*Ffr„Ml(NV)  (4) 

where  the  Fresnel  term  is  a  function  only  of  the  emittant  angle,  which  equals  the 
incident  angle,  and  E *  is  illumination  per  solid  angle. 

We  next  approximate  the  Fresnel  reflectance  by 

F  =  (l  —  V -N)n  =  (l  —  cos  9)n  (5) 

where  n  is  a  parameter  related  to  the  index  of  refraction,  and  controls  the  rate  at 
which  the  Fresnel  reflectance  falls  toward  zero  as  the  light  beam  moves  away  from 
its  maximum  reflectance  of  1.0  for  the  grazing  condition.  For  incident  angles  9 
greater  than  60°  we  can  approximate  (5)  by  its  Taylor  Series  expansion,  retaining 
only  the  first  terms: 
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F  =  l-  nV  N 


(6) 


Equation  (6)  is  then  substituted  into  (4),  which  in  turn  can  be  used  to  test  for 
condition  (1).  Dividing  out  the  common  factor  Em,  we  find  that 


Pr 


t1 


-n N-V  -l  +  n(N  +  AN)  V 
llAzf 


F(u) 


or 


<«  =F(U) 

V  l|Az||h 


(7) 


(8) 


But  as  long  as  V  is  a  constant,  which  it  is  for  viewing  glossy  surfaces  at  sufficient 
distances,  then  the  fractal  nature  of  N  will  result  in  a  scaled  version  of  (8)  satisfying 
condition  (l).  Hence  glossy  reflections  off  a  fractal  3D  surface  seen  under  hemifield 
illumination  will  produce  a  fractal  image-intensity  surface. 
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Appendix  II  Recovering  Material  Properties  from  Sound 
(R.P.  Wildes) 

1.0  Introduction 

Physical  modeling  of  world  events  occupies  an  important  position  in.  computational 
approaches  to  perception  and  motor  control  (Brady  et  al.,  1982;  Horn,  1970;  Marr, 
1982;  Richards,  1988).  Such  models  are  dual  in  purpose.  First,  they  lend  to  a  precise 
statement  of  the  problem  under  consideration.  Second,  they  expose  constraints  on 
the  solution  space,  which  indicate  how  the  problem  can  be  solved.  Here,  we  use  this 
approach  for  an  initial  attack  on  a  previously  unsolved  problem  in  audition:  How  it 
is  possible  to  recover  the  material  property  of  an  object  from  the  sound  generated 
when  it  is  struck? 

Consider  some  examples.  When  you  hit  a  drinking  glass  with  your  finger-nail 
it  produces  a  distinct  “ringing”  sound.  In  comparison,  if  you  strike  a  log  with  your 
knuckles  it  gives  off  a  short  “thud” .  In  either  case  you  easily  identify  the  material 
of  the  struck  object.  There  are  other  cases  where  the  sound  from  striking  an  object 
leaves  one  with  no  clear  impression  of  material  type.  Objects  which  are  clamped  or 
otherwise  artificially  damped  often  belong  to  this  latter  class.  Our  goal  is  then  two¬ 
fold:  First,  we  seek  to  discover  a  physical  parameter  of  the  Bound  following  impact 
which  is  intrinsically  related  to  material  type.  Second,  we  desire  to  recognize  those 
situations  where  such  material  identification  is  not  possible.  Guided  by  a  model 
of  vibrating  solids  we  shall  choose  to  satisfy  these  goals  with  a  measure  of  energy 
dissipation  during  vibration — an  intrinsic  material  property. 


2.0  A  Physical  Model 

When  a  solid  object  is  subject  to  an  impact,  much  of  the  resultant  sound  is  due  to 
the  vibration  of  the  object.  Therefore,  we  study  the  mechanics  of  vibrating  solids, 
and  take  that  motion  as  analogous  to  the  sound  production.  In  the  following 
presentation  we  do  not  consider  the  initial  transient  due  to  impact.  Instead,  we 
concentrate  on  the  steady-state  and  damped  behavior  of  the  solid.  We  proceed  in 
two  stages.  First,  we  introduce  some  basic  concepts  concerning  the  deformation  of 
solids.  Second,  we  develop  a  particular  model  of  a  vibrating  solid:  the  standard 
anelastic  linear  solid  described  by  Zener  (1948). 1 

JMany  of  the  results  in  Sections  2.1,  2.2  and  3.1  are  known  in  the  literature  on  anelas- 
ticity.  For  further  details  see  Nowick  <k  Berry,  1972;  Wert,  1986;  Zener,  1948. 


90 


Richards  &  Ullman 


Inferences  from  Images 


2.1  Basic  Concepts 


Hooke’s  law  states  that  an  ideal  elastic  material  can  be  described  by  the  relation 

o  =  Me  (1) 

Where:  a  is  a  stress  variable  (corresponding  to  a  force);  £  is  a  strain  variable 
(corresponding  to  a  displacement);  and  Af  is  an  appropriate  modulus  of  elasticity.2 
In  many  situations  it  is  convenient  to  define  J  =  as  the  modulus  of  compliance. 
Then  (l)  becomes 

t  —  Jo  (2) 

Elastic  behavior  so  defined  is  seen  to  be  characterized  by  three  conditions.  First, 
there  is  a  1  :  1  correspondence  between  stress  and  strain.  Second,  the  stress-strain 
relationship  is  linear.  Third,  there  is  no  time  dependence  in  the  mapping  between 
stress  and  strain.  Clearly,  any  real  solid  is  only  approximately  characterized  by 
these  conditions.  In  most  cases,  condition  three  is  a  particularly  erroneous  assump¬ 
tion.  Therefore,  we  proceed  by  considering  cases  where  there  is  a  time  dependence 
between  stress  and  strain.  Solid  materials  characterized  by  such  a  time  dependence 
are  called  anelastic  solids. 

Suppose  a  specimen  is  subjected  to  a  periodic  stress 

a  =  <ro«’wt  (3) 


Here:  <70  is  the  stress  amplitude;  u  =  2x/  is  the  angular  frequency;  t  is  time;  and 
»  =  y/—l-  Then,  our  assumptions  of  linearity  and  time  dependence  dictate  that 
strain  has  the  form 


£  =  «o « 


«(wt  —  tf>) 


(<) 


with  eo  the  strain  amplitude  and  4>  a  phase  angle.  Then,  <j>  captures  the  notion 
of  a  time  dependence  between  stress  and  strain.  For  subsequent  operations  it  is 
convenient  to  separate  (4)  into  its  real  and  imaginary  parts.  Letting  £j  be  the 
component  of  strain  in  phase  with  stress  and  be  that  component  of  strain  ^  out 
of  phase  with  stress  we  have 

t  -  (£i  -  t£2K“‘  (5) 

Now,  in  analogy  with  (2)  we  define  the  complex  compliance 

J(u)  =  ~  (6) 

Then,  dividing  (4)  by  (3)  gives 

2Notice  that  for  a  fine  grained  analysis  <r  and  t  would  be  second  order  tensors.  For  our 
purposes  it  suffices  to  consider  them  as  scalars. 
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J(u)  =  — <^(w)  (7) 

<To 

The  complex  compliance  can  also  be  represented  as  a  sum  of  real  and  imaginary 
parts  by  dividing  (5)  by  (3)  and  defining 

i/(w]  — —  J\  (w) 


We  proceed  by  expanding  (7)  in  terms  of  Euler’s  relation3  and  then  equating 
read  and  imaginary  parts  of  (7)  and  (8)  to  get 


Ji(u)  =  —  cos  ^(w) 
<ro 

J^u)  —  —  sin 

<ro 


(9) 


From  which  we  reach  the  important  result  that 

♦  ^  J2 


(10) 


In  the  anelasticity  literature  tan  <j>  is  referred  to  as  internal  friction.  Internal  friction 
is  an  intrinsic  property  of  a  given  material;  it  measures  the  degree  of  anelasticity. 
We  shall  ultimately  recover  a  measure  closely  related  to  internal  friction  as  our 
characterization  of  solid  materials. 

Finally,  it  is  worth  noting  that  analogous  calculations  can  be  carried  out  in 
terms  of  the  modulus  of  elasticity.  Such  derivations  would  begin  by  defining 

M(«)  =  f  (11) 

with  the  ultimate  conclusion  that 

tan^=Mf  (i2) 


2.2  The  Standard  Anelastic  Linear  Solid 

The  model  of  solids  that  we  wish  to  consider  is  depicted  as  a  mechanical  structure 
in  Figure  1.  We  refer  to  this  model  as  the  standard  anelastic  linear  solid.  The 
model  consists  of  three  elements.  Elements  a  and  b  are  Hookean  springs;  let  them 
have  corresponding  compliances  Ja  and  Jj,.  Element  c  is  a  Newtonian  dashpot, 
which  serves  to  provide  internal  friction.  A  Newtonian  dashpot  can  be  thought  of 
as  a  plunger  moving  in  a  viscous  liquid.  Its  velocity  of  motion  is  proportional  to 
the  applied  force.  In  terms  of  stress  and  strain 

o  =  F?t  (13) 

3e*9  =  cos0  +  isintf 
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Figure  1.  A  mechanical  model  of  the  standard  anelastic  linear  solid.  Ja  and 
Jb  are  compliances  of  the  two  springs,  n  is  the  viscosity  of  the  “liquid”  in  the 
daehpot 


with  rj  the  viscosity  of  the  liquid.  For  our  purposes  we  let  n  =  where  ra  is 
the  time  required  for  the  system  to  achieve  equilibrium  when  a  constant  stress 
is  applied.  Notice  that  this  model  captures  our  intuitive  notion  of  what  happens 
when  a  force  is  applied  to  a  solid:  Initially  the  solid  deforms  (element  a).  With 
time  further  deformation  takes  place  (element  c).  When  the  force  is  removed  the 
solid  returns  to  normal  (elements  a  and  b).* 

In  order  to  be  more  precise,  we  characterize  the  behavior  of  our  model  in  terms 
of  the  differential  equation 

ao<r  +  ai&  —  boe  +  bii  (14) 

where  the  derivatives  are  with  respect  to  time.  We  begin  by  setting 

fo  =  JaPa 

f6  =  Mb  (15) 

(c  =  Jb  — 

T<r 

4In  some  cases  the  solid  never  returns  to  its  original  configuration.  This  introduces  the 
concept  of  visco-elasticity;  we  shall  not  consider  such  cases. 
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Figure  2  A  second  mechanical  model  for  the  standard  anelastic  linear  solid. 


We  now  introduce  the  standard  rules  for  combining  multiple  stresses  and  strains: 
For  series  combination,  stresses  are  equal  while  strains  add.  The  reverse  is  true  for 
parallel  combinations.  Applying  these  rules  to  the  model  of  Figure  1  gives 


e  =  ea  +  e  b 

ffe  =  Cc 
o  —  <ra 

_ 

“  Ja 

=  o-fc  +  <rc 
_  lye c 

~  7b 

We  can  rearrange  (16)  to  the  form  (14)  as 

J<7  +  TaJab  —  e  +  T£ 


(16) 


(17) 


where  J  —  Ja  +  Jb- 

We  now  characterize  our  model  in  terms  of  the  standard  measure  of  internal 
friction  tan<£.  Following  methods  similar  to  those  of  Section  2.1,  we  substitute  (3) 
and  (5)  into  (17)  and  equate  real  and  imaginary  parts  to  find  that 


Ji  =  Ja  + 


J2  =  Jb 


Jb 


1  +  fur,)2 
wr<7 


1  +  (wrff)2 


(18) 


Then,  upon  substituting  into  (10)  we  get 


tan  4>  —  Jb 


UTer 

J  +  ^a(u/r„)2 


(19) 
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We  continue  by  deriving  a  more  convenient  form  for  tan  <f>.  Consider  the  model 
of  Figure  2.  This  model  can  be  shown  to  be  formally  equivalent  to  the  model  of 
Figure  1.  It  is  customary  to  represent  this  latter  model  in  terms  of  the  elastic 
moduli  and  the  time  to  strain  equilibrium  re.  Following  the  methods  used  above 
and  using  analogous  symbols,  we  can  find  that 


M1  = 

Afj  =  Mb 


Ma  +  Mb(ur()2 
1  +  (wre)2 

( JTe 


(20) 


*1  +  (wre)s 

Finally,  to  get  our  simpler  expression  for  tan^  we  combine  (18),  (19)  and  (20)  to  arrive 
at 


tan  ^  ss 


Jb _ wr 

(JaJ)2  1  +  («r)5 


(21) 


with  r  the  geometric  mean  of  r„  and  re . 

We  conclude  this  section  with  a  recapitulation.  In  this  section  we  began  by 
introducing  the  concepts  of  stress,  strain  and  internal  friction.  We  proceeded  to 
present  the  standard  linear  model  of  anelastic  solids  in  terms  of  these  three  param¬ 
eters.  In  Section  3  we  make  use  of  these  relations  to  classify  solid  materials  from 
the  sound  they  produce  following  impact. 


S.O  Recovery  of  Material  Type 

When  an  anelastic  solid  is  struck  and  set  into  vibration  its  behavior,  and  hence  the 
sound  it  generates  following  a  brief  transient,  is  dictated  by  our  standard  linear 
model.  Our  goal  is  to  extract  some  intrinsic  parameter  of  the  solid  material  from 
this  dynamic  behavior.  As  noted  earlier,  internal  friction,  tan  <f>,  is  just  such  a 
measure.  We  now  proceed  to  derive  two  measures  of  internal  friction  and  then 
show  how  they  can  work  together. 


S.l  Internal  Friction  and  Bandwidth,  Q  1 

We  begin  by  relating  internal  friction  to  peak  vibration  of  our  standard  linear 
model.  To  facilitate  this  set  of  derivations,  we  follow  (Zener,  1948)  and  momentar¬ 
ily  consider  the  mechanical  system  as  having  only  one  degree-of-freedom.  This  is 
depicted  in  Figure  3.  Here  we  have  lumped  the  inertia  of  the  system  into  a  single 
member  I,  while  the  anelastic  and  elastic  components  are  lumped  into  an  anelastic 
spring  with  complex  modulus6 

6It  can  be  shown  that  an  arbitrary  standard  anelastic  linear  system  in  vibration  can 
always  be  so  reduced,  see  (Nowick  <fc  Berry,  1972,  appendix  A.) 
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Figure  S  A  lumped  model  of  Figures  1  and  2. 


M  =  A/(l  +  ttand)  (22) 

From  Newton’s  law  we  observe  that 

Ie  =  <t  (23) 

Assuming  periodic  motion,  we  also  have  that 


e  =  e0e 


iuit 


and  as  always 

<7  =  Me 


We  can  solve  for  e  by  substituting  (22),  (24),  and  (25)  into  (23)  to  get 

a 

e  - - 2S - 

(l  ~  +  »tan  0 

By  inspection,  we  find  (26)  at  a  maximum  when 


Further  consideration  shows  that  (26)  is  at  2~^  maximum  amplitude  when 

u?  =  y(l  ±  tan  4>) 

We  arrive  at  our  first  measure  of  internal  friction  by  noting  that 


wo 


tan  ^ 


(24) 

(25) 

(26) 

(27) 

(28) 

(29) 
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where  we  use  Q  in  accordance  with  the  corresponding  concept  from  electrical  circuit 
theory.  Thus,  internal  friction  can  be  measured  in  terms  of  the  sharpness  of  the 
peak  around  the  maximum  of  vibration  of  our  specimen.  This  will  correspond  to  a 
sharpness  in  the  peak  of  the  acoustic  signal  generated  by  this  vibration.  It  is  this 
acoustic  peak  which  we  suggest  measuring. 


S.2  Internal  Friction  and  Decay  Rate  te 


We  now  seek  a  measure  of  internal  friction  in  terms  of  the  decay  in  amplitude  of  a 
vibrating  anelastic  solid.  Combining  (22),  (23)  and  (25)  we  find  that 

It  +  M(1  + » tan  ^)f  =  0  (30) 


describes  the  behavior  of  our  standard  linear  system  as  the  vibration  decays.  The 
solution  to  (30)  is  found  to  be 


e  =  <o« 


(31) 


with  A  =  ln(AnfAn+i),  where  An  and  An+i  are  successive  peaks  in  the  amplitude 
of  the  decaying  tone  of  frequency  /.  (A  is  called  the  logarithmic  decrement.)  We 
solve  for  the  internal  friction  by  substituting  (31)  into  (30)  and  equating  imaginary 
parts  to  find  that 

=  tan  4>  (32) 


Because  of  the  potential  difficulties  of  measuring  two  successive  amplitudes,  we 
proceed  a  step  further.  Letting  te  denote  the  time  required  for  the  amplitude  to 
decrease  to  A  of  its  original  value,  we  note  that 

1  2  * 


te  —  TT  ~ 


/  A  uA 


(33) 


Rearranging  and  substituting  (33)  into  (32)  then  yields 


(34) 


Our  second  measure  of  internal  friction  is  then  given  in  terms  of  the  time  it  takes 
the  amplitude  of  vibration  (and  therefore  of  the  sound)  to  decrease  to  some  fraction 
of  the  original  amplitude. 
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S.S  Specific  Loss 

We  now  have  in  hand  two  measures  of  internal  friction  (29)  and  (34).  Examination 
of  these  relations  reveals  that  tan  is  a  function  of  frequency.  For  our  purposes, 
it  would  be  more  convenient  to  have  a  single  characterization  of  anelastic  behavior 
rather  than  an  entire  “damping  spectrum” .  Such  a  measure  is  available  in  terms  of 
loss  per  cycle,  known  as  “specific  loss” .  To  motivate  this  measure  we  notice  from 
(21)  that  for  (ur)2  »  l,  u>  tan  <f>  is  essentially  independent  of  frequency.  Empirical 
investigations  of  solids  (Bennewitz  ic  Rotger,  1936;  Gemant  ic  Jackson,  1937)  con¬ 
firm  that  this  relation  is  true  for  a  very  wide  range  of  vibrational  frequencies  and 
materials.  Therefore,  we  define  specific  loss  as 

S  =  w  tan  <f>  (35) 

as  our  measure  of  anelasticity  and  as  our  key  to  classifying  solid  materials  by  sound. 


S.4  A  Conatraint  Curve 

The  final  task  we  face  is  to  develop  a  method  of  telling  when  our  two  measures 
of  internal  friction  will  yield  an  accurate  characterization  of  a  given  sample.  For 
example,  let  us  say  an  object  is  struck  and  our  auditory  system  recognizes  its  pitch 
ss  Ag  =  880  Hz  with  a  decay  rate  of  1  sec  for  te.  By  equation  (34)  we  can  recover 
a  measure  of  internal  friction,  and  hence  characterize  the  material.  But  how  do  we 
know  this  answer  is  correct?  To  confirm  that  the  material  behaves  as  our  standard 
anelastic  linear  solid,  we  need  a  second,  corroborating  measure.  But  of  course,  this 
can  be  obtained  by  recovering  the  bandwidth,  Q-1,  of  the  sound  about  880  Hz, 
using  equation  (29).  In  this  case,  if  each  measure  does  not  yield  the  same  value  for 
tan  4>,  then  we  must  disregard  this  characterization. 

By  clotting  te  versus  Qu~l  as  shown  in  Figure  4,  we  can  construct  a  constraint 
line  tha .  colds  for  materials  obeying  our  standard  anelastic  linear  model.  The  line 
is  independent  of  frequency  because  each  measure  of  internal  friction  has  been 
recast  as  “specific  loss"  (35).  Clearly,  all  valid  measurements  must  lie  along  this 
constraint  curve.  Here  we  show  how  particular  material  types  spread  out  along  the 
curve  with  metal  at  the  upper  right  and  rubber  at  the  lower  left.  The  data  have  been 
collected  from  Gemant  ic  Jackson  (1937),  Berg  ic  Stork  (1982)  and  the  Handbook 
of  Chcmiatry  and  Physics.  Exactly  how  fine  a  distinction  can  be  realistically  made 
along  the  constraint  curve  is  largely  an  empirical  question  (Waller,  1938;  Warren 
ic  Verbrugge,  1984). 
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Figure  4  The  constant  curve  relating  the  two  measures  of  anelasticity  for  our 
standard  linear  solid  model.  Decay:  te ;  tuning  width:  Q~x- 


4-0  Summary 

We  have  described  how  to  make  auditory  measures  of  an  intrinsic  parameter  of  solid 
materials.  The  parameter  chosen  is  called  specific  loss,  and  is  related  to  internal 
friction,  a  measure  of  anelasticity  for  a  given  material.  The  two  measures  presented 
are  related  to  the  width  of  the  resonant  peak  of  a  sound  and  the  time  of  its  decay. 
Crucial  to  our  development  has  been  an  understanding  of  a  standard  anelastic  linear 
system,  a  physical  model  of  the  mechanics  of  solid  vibration.  We  hope  this  work 
demonstrates  the  usefulness  of  using  an  explicit  physical  model  of  sound  production 
in  attempts  to  study  sound  recognition. 
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